Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Observabilty for complex systems and related technologies.

GenAI Observability in Grafana Cloud: End-to-End Agent Debugging (Demo)

From Observability for GenAI Applications (Grafana OpenTelemetry Community Call) We drill into traces to see which agents called which tools, where errors occurred, how long each LLM call took, and how costs and tokens are distributed. The walkthrough also covers using AI assistance to summarize long traces and identify optimization opportunities in real time..

AI in Production Is Growing Faster Than We Can Trust it

Enterprise software has moved past the generative AI testing phase. Businesses with millions of daily users or workloads are no longer just prototyping LLMs in a vacuum. They’re directly wiring agentic efficiency into product interfaces and infrastructure to stay competitive. This wave is often compared to the spread of microservices in the past, but we aren’t just adding new dependencies and complexity.
Sponsored Post

Breaking Down IT Silos with OpManager Plus's Full-stack observability

In today's complex and dynamic IT landscape, a single application relies on dozens of interconnected services, from physical servers to virtual machines, cloud instances, and third-party APIs. When something goes wrong, a traditional monitoring approach that focuses on individual components is no longer enough. This is where full-stack observability becomes critical. It's the ability to gain a holistic, real-time understanding of your entire technology stack, from the user experience all the way down to the underlying network infrastructure.

Observability That Works: Understand System Failures and Drive Better Business Outcomes

Modern systems don't fail because engineers lack skills; they fail because teams can't see why systems are failing at all or can’t see why they’re failing fast enough. Often, the problem isn't a lack of tools — it's a lack of clear, connected visibility across data, teams, and systems. This is where observability transforms how organizations operate. It's no longer just about keeping systems running.

From Monitoring Signals to Observability Maturity

Efficient monitoring delivers fast results: alerts fire within seconds, dashboards refresh continuously, and teams know the moment something changes. Understanding arrives later. An alert may show that a value shifted, but it does not explain why it shifted, how far the impact will spread, or which components truly matter. Teams see the signal, not the system behavior behind it. This gap defines the limit of traditional monitoring. Detection has improved, but explanation has not kept pace.

Measuring Claude Code ROI and Adoption in Honeycomb

At Honeycomb, we’ve been using Claude Code across our engineering team for a while. Anecdotally, I had a sense of who the power users were, and I had seen some examples of complex usage. But I wanted to be able to confidently answer questions, like: Claude Code supports OpenTelemetry out of the box, which means sending telemetry to Honeycomb takes just a few minutes of configuration.

ChatOps that actually works: Grafana Cloud, Slack, and AI-powered observability

Context switching isn’t just inefficient—under pressure, it’s exhausting. It slows decision-making, increases the risk of mistakes, and makes even experienced engineers feel like they’re always a step behind the system they’re responsible for. At Grafana Labs, we want to build tools that meet you where you are. That's why we embedded Grafana Assistant, our context-aware AI assistant, directly in Grafana Cloud.

Observability for GenAI Applications (Grafana OpenTelemetry Community Call)

In this episode, we’re diving into observability for Generative AI apps. AI helps us write code and monitor applications in production - but how do we observe the AI itself? And how do we make sense of complex, non-deterministic AI systems? We’re joined by two great guests: Ishan Jain, working on GenAI observability and Luccas Quadros, working on Grafana Assistant. Together, they bring both platform-level insights and real-world perspectives.