Operations | Monitoring | ITSM | DevOps | Cloud

Observe your AI agents: Endtoend tracing with OpenLIT and Grafana Cloud

In another post in this series, we discussed how to instrument large language model (LLM) calls. This can be a good starting point, but generative AI workloads increasingly rely on agents, which are systems that plan, call tools, reason, and act autonomously. And their non‑deterministic behavior makes incidents harder to diagnose, in part, because the same prompt can trigger different tool sequences and costs.

How to monitor LLMs in production with Grafana Cloud,OpenLIT, and OpenTelemetry

Moving a large language model (LLM) application from a demo to a production‑scale service raises very different questions than the ones you ask when playing with an API key in a notebook. In production, you have to answer: How much is each model costing us? Are we keeping latency within our service‑level objectives? Are we accidentally returning hallucinations or toxic content? Is the system vulnerable to prompt‑injection attacks?

What Engineers Want from AI in Observability... According to the 2026 Observability Survey Report

The results show strong interest in AI for forecasting, root cause analysis, onboarding, and generating dashboards, alerts, and queries. But when it comes to autonomous action, practitioners are more cautious — and 95% say AI needs to show its work to earn trust.

Real-Time Data: The Engine of Efficient, Sustainable Data Centers

Imagine knowing every detail of your data center as it happens. Real-time data makes this possible. You can monitor systems, track performance, and adjust resources on the fly. This proactive approach leads to smoother operations and reduced downtime. By constantly having up-to-date information, you can maintain peak efficiency in your facility. Such insights allow you to optimize cooling and power use, which are crucial to keeping costs down.

AI in observability in 2026: Huge potential, lingering concerns

The role of AI in observability is evolving rapidly, but the data from our fourth annual Observability Survey makes one thing abundantly clear: the potential is real, and so are the reservations. Practitioners overwhelmingly see value in using AI to help surface anomalies, forecast and spot trends, assist with root cause analysis, and get new users up to speed quicker.

Open standards in 2026: The backbone of modern observability

Open source software and open standards are now an essential part of how organizations maintain their systems. That's not to say they haven't always been important, but the fourth annual Observability Survey, brought to you by Grafana Labs, shows just how deeply the shift to open has taken hold, with 77% of respondents saying open source and open standards are important1 to their observability strategy.

Engineers Want AI in Observability - With One Catch: 4th Annual Observability Survey by Grafana Labs

Actually useful AI is welcome in observability. AI for the sake of AI is not. In this overview of Grafana Labs’ 4th annual Observability Survey, Marc Chipouras shares what 1,300+ respondents from 76 countries told us about the current state of observability — and what comes next. This year’s survey explores four major themes: The results show strong interest in AI for forecasting, root cause analysis, onboarding, and generating dashboards, alerts, and queries. But when it comes to autonomous action, practitioners are more cautious — and 95% say AI needs to show its work to earn trust.

Bridge the DevSec divide: Using Grafana Cloud and Miggo for runtime protection

Note: This blog post is co-authored by Daniel Shechter, CEO and co-founder of Miggo Security. Modern runtime security is critical to understand complex systems and detect and protect against attacks, especially in rapidly evolving cloud native architectures. For many security teams, however, achieving deep visibility into runtime risks remains a moving target.

Quickly go from exploration to action with new one-click integrations in Grafana Drilldown

The Grafana Drilldown apps gives you a queryless, point-and-click way to explore your metrics, logs, traces, and profiles. But finding an insight is only half the job—you still need to act on it. Previously, that meant leaving Drilldown, manually copying queries, and navigating through Grafana's dashboards, Alerting, and "Explore" interfaces to pick up where you left off.