Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Coffee and Claude: How Honeycomb MCP Makes AI Work for You

If you caught our recent webinar, “Introducing Honeycomb MCP: Your AI Agent’s New Superpower,” you know it was a lively mix of big ideas, demos, and a few laughs about the messy, fast-moving world of AI. Hosted by Austin Parker, Morgante Pell, and James Bland from AWS, the conversation explored how Honeycomb’s new Model Context Protocol (MCP) server is changing the way developers and AI agents interact with data.

Part 1: Digital Twins and Predictive Maintenance

As machines and systems grow more connected and complex, the traditional toolbox for managing them feels increasingly outdated. Engineers and operators need new approaches that match the realities of software-driven products and data-intensive environments. Digital twins provide that leap forward. By creating a virtual model of a physical asset and continuously feeding it with real-time data, digital twins reveal both current performance and likely future outcomes.

Grafana Tempo: Setup, Configuration, and Best Practices

As systems grow, understanding how a request moves across multiple services becomes harder. Traces help bring this picture together by showing the exact path a request takes, along with the timings that matter. Grafana Tempo is built for this kind of workload. It stores traces efficiently, works well with OpenTelemetry, and keeps the operational overhead low.
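To make the idea of "the exact path a request takes, along with the timings" concrete, here is a minimal sketch of a trace modeled as a tree of timed spans. It does not use Tempo or OpenTelemetry APIs; the `Span` class, the `request_path` helper, and the service names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """One timed operation in a trace (hypothetical minimal model)."""
    name: str
    service: str
    start_ms: float
    end_ms: float
    children: list["Span"] = field(default_factory=list)

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


def request_path(span: Span, depth: int = 0) -> list[str]:
    """Walk the span tree depth-first and return one indented line per span."""
    lines = [f"{'  ' * depth}{span.service}:{span.name} {span.duration_ms:.0f}ms"]
    # Visit children in start-time order so the output follows the request.
    for child in sorted(span.children, key=lambda s: s.start_ms):
        lines.extend(request_path(child, depth + 1))
    return lines


# Illustrative data only: a checkout request that calls two downstream services.
trace = Span("GET /checkout", "gateway", 0, 120, [
    Span("charge", "payments", 40, 110),
    Span("load_cart", "cart", 5, 35),
])

for line in request_path(trace):
    print(line)
```

Reading the indented output top to bottom shows which service handled each step and how much of the total request time it consumed, which is the kind of picture a tracing backend is meant to store and serve.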

From Telemetry to Truth: Why Observability Must Be Service-Centric

Modern enterprises depend on systems that appear calm: dashboards glow, availability reads steady, and metrics suggest composure. But the signals only tell part of the story. Conversion softens at the margins, regional sign-in times drift, a compliance report misses an expected field. The puzzle isn’t visibility; it’s meaning. Components describe status; services carry outcomes.

How Datadog is Reinventing On-Call #Datadog #OnCall #DevOps

Datadog is reimagining how engineers handle incidents—moving beyond simple alerts to an intelligent, voice-driven on-call experience. With Datadog On-Call, teams can acknowledge alerts, access runbooks, post to Slack, and collaborate in real time, all before even touching their computer. See how Datadog brings incident response, communication, and automation together so you can respond faster and keep customers informed.

Building Smarter AI Products #Datadog #DASH #AI

AI capabilities are advancing faster than ever — transforming how teams design, build, and ship intelligent products. In this teaser from Building Successful AI-powered Products at Datadog DASH, experts discuss the rise of agent-based systems, evolving model capabilities, and how to stay ahead in the new era of automation.

Safely Roll Out Features with Datadog Feature Flags

In this short demo, see how Datadog Feature Flags help teams release new functionality safely and efficiently. Datadog provides advanced targeting, progressive rollouts, and automatic rollbacks — all integrated with powerful observability data. Learn how you can use simple on–off flags or multi-variant configurations to test and deploy features with confidence. With built-in monitoring of key guardrail metrics, Datadog can automatically pause or reverse rollouts when issues are detected, keeping your releases stable.
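The progressive-rollout-with-guardrails idea can be sketched roughly as follows. This is not Datadog's API: the function name, the step schedule, and the 2% error-rate threshold are assumptions chosen purely for illustration.

```python
def next_rollout_step(current_pct, error_rate,
                      max_error_rate=0.02, steps=(1, 5, 25, 50, 100)):
    """Return the next rollout percentage, or 0 to roll the feature back.

    current_pct    -- percentage of traffic currently receiving the feature
    error_rate     -- observed guardrail metric for the exposed traffic
    max_error_rate -- hypothetical threshold above which we roll back
    steps          -- hypothetical schedule of exposure percentages
    """
    # Guardrail breached: reverse the rollout entirely.
    if error_rate > max_error_rate:
        return 0
    # Otherwise advance to the first scheduled step above the current exposure.
    for step in steps:
        if step > current_pct:
            return step
    # Already at the final step; stay fully rolled out.
    return current_pct


print(next_rollout_step(5, 0.01))    # healthy: advance to the next step
print(next_rollout_step(25, 0.05))   # guardrail breached: roll back
```

A real system would evaluate the guardrail over a time window and might pause rather than fully revert; the sketch only shows the basic decision of advancing or rolling back based on a monitored metric.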

Debugging in Elixir with Observer

Erlang's Observer is often discussed in passing and regarded as a curiosity during Elixir courses. However, Observer provides many powerful tools for monitoring and debugging your application, both in development and production. Together, we will learn how to access the Observer GUI and debug a project that leaks memory, both locally and through a remote node. We will set up process tracing and track garbage collections to find the offending code in our sample project. Let's get started!

A different view for the performance timings of an uptime monitor

When you monitor a website at Oh Dear, the monitoring also includes the historical performance insights that belong to that monitor. It gives you a historical overview of the speed of that monitor, allowing you to see anomalies and changes over time. As of today, there's a second view available, one that matches the web browser visualisation of the timing of a single request. This view shows the same waterfall information you'd find in Chrome or Firefox, providing a familiar view to developers worldwide.

Logs Are Your Data Platform: Dynamic, Queryable, S3-Backed

Modern systems move fast. Features ship daily, user behavior shifts hourly, and risks surface in minutes. In that reality, logs are not just a troubleshooting artifact. They are your most expressive data source. Logs capture the words developers write to their future selves. They carry the full story of requests, users, experiments, errors, feature flags, and revenue events.