Faster incident response through distributed tracing: Inside Glovo's use of Traces Drilldown

It’s almost 1 p.m. on a Monday afternoon and you’re hungry. You pull up your meal delivery app and select your favorite restaurant and dish. Then you go to check out and nothing happens. Your frustration mounts as you get hungrier by the minute. But there’s frustration on the other side of that transaction as well—engineers are scrambling to figure out what’s wrong as orders drop and revenue losses rise.

Prometheus Gauges vs Counters: What to Use and When

Choosing the wrong metric type in Prometheus can lead to inaccurate dashboards, false positives in alerting, and missed indicators of system failure. Gauge metrics are intended for tracking values that can go up and down, such as memory usage, queue depth, or the number of active connections. Unlike counters, which only increment (or reset on restart), gauges reflect the current state of a resource at scrape time.

Going beyond AI chat response: How we're building an agentic system to drive Grafana

As we look at the role AI can play in Grafana going forward, we want to move beyond the simple chat responses that dominate the world of LLMs today and into agentic systems—AI that can understand, reason, and act on your behalf. The ultimate goal is to make it easy to get things done in Grafana using natural language—whether you’re a seasoned SRE or a new developer. And in the AI world, we call this moving from chat completion to task completion.

Operational Intelligence - the new horizon of observability

Monitoring your systems isn't enough anymore. Neither is “asking questions about your system”. Operational Intelligence embraces observability to proactively deliver business insights, support decision-making, and accelerate innovation. It seems that as the observability market grows and more and more products come into the space, the meaning of the term observability itself becomes more and more nebulous.

How Dropbox rebuilt its logging stack with Grafana Loki after a data center went dark

Two years ago, a power outage knocked a Dropbox data center offline. It wasn’t just any data center. It was the only one where Dropbox hosted Grafana Loki, meaning engineers couldn’t access their log data. “We had considered a data center outage when we were rolling out Loki, but it had just never risen up in priority enough to get put into multiple data centers,” said Chris Hodges, an infrastructure software engineer at the cloud storage company.

How to detect vulnerable GitHub Actions at scale with Zizmor

As we previously reported on April 26, 2025, we had a security incident via an insecure GitHub Action and we have since published a post-incident review. We have confirmed that there has been no code modification, unauthorized access to production systems, exposure of customer data, or access to personal information.

The Road to Loki 4.0 (Loki Community Call June 2025)

In this Loki Community Call, we welcome back Ed Welch, Principal Engineer on the Loki team. We will be discussing with Ed what is next for Loki as we push forward to Loki 4.0. If you are interested, learn more about potential architecture changes, storage formats, and an open discussion on where Ed and the Loki team would like to see the future of Loki, then make sure you join us live and have your questions answered!

Observability Across Asia-Pacific: What's Holding Teams Back? | 2025 Observability Survey Analysis

What’s holding back observability maturity in Asia-Pacific? Grafana Labs' cofounder Anthony Woods shares key takeaways from the largest global observability survey. Learn how SaaS, budget concerns, and org structure are shaping Asia-Pacific (APAC)'s future. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more.

Grafana Cloud updates: The latest features in Kubernetes Monitoring, Fleet Management, and more

We consistently roll out helpful updates and fun features in Grafana Cloud, our fully managed observability platform powered by the open source Grafana LGTM Stack ( Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics). In case you missed them, here’s our monthly round-up of the latest and greatest Grafana Cloud updates.