Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Observabilty for complex systems and related technologies.

Production Is Where the Rigor Goes

In early February, Martin Fowler and the good folks at Thoughtworks sponsored a small, invite-only unconference in Deer Valley, Utah—birthplace of the Agile Manifesto—to talk about how software engineering is changing in the AI-native era. They recently published a summary of key insights and themes from the summit, sorted into ten topical buckets.

AI in observability in 2026: Huge potential, lingering concerns

The role of AI in observability is evolving rapidly, but the data from our fourth annual Observability Survey makes one thing abundantly clear: the potential is real, and so are the reservations. Practitioners overwhelmingly see value in using AI to help surface anomalies, forecast and spot trends, assist with root cause analysis, and get new users up to speed quicker.

Open standards in 2026: The backbone of modern observability

Open source software and open standards are now an essential part of how organizations maintain their systems. That's not to say they haven't always been important, but the fourth annual Observability Survey, brought to you by Grafana Labs, shows just how deeply the shift to open has taken hold, with 77% of respondents saying open source and open standards are important1 to their observability strategy.

Engineers Want AI in Observability - With One Catch: 4th Annual Observability Survey by Grafana Labs

Actually useful AI is welcome in observability. AI for the sake of AI is not. In this overview of Grafana Labs’ 4th annual Observability Survey, Marc Chipouras shares what 1,300+ respondents from 76 countries told us about the current state of observability — and what comes next. This year’s survey explores four major themes: The results show strong interest in AI for forecasting, root cause analysis, onboarding, and generating dashboards, alerts, and queries. But when it comes to autonomous action, practitioners are more cautious — and 95% say AI needs to show its work to earn trust.

Shifting Metrics Right

In the shift left era where it feels like we’re pushing everything as far to the start of the SDLC as we can, it may seem counterintuitive to shift anything right. That is, however, exactly what I suggest when it comes to generating metrics. How far you go to the right of the SDLC is a much more nuanced question and is dependent on a lot of factors, and on what metrics you’re talking about.

Instrumenting Rust TLS with eBPF

Coroot is an open source observability tool that uses eBPF to collect telemetry directly from applications and infrastructure. One of the things it does is capture L7 traffic from TLS connections without any code changes, by hooking into TLS libraries and syscalls. Works great for OpenSSL. Works for Go. Then rustls enters the picture and everything stops being obvious. With OpenSSL, everything is nicely wrapped: From eBPF’s point of view this is perfect: Everything happens inside one call.

Observability for distributed IoT systems: reducing alert fatigue through modular architecture

Many distributed IoT teams hit the same wall at roughly the same stage. The fleet grows, telemetry coverage improves, dashboards multiply, and on paper the system becomes more visible. In practice, the operating picture often gets harder to read. There are more alerts to review, more exceptions that do not fit existing runbooks, more cases where someone has to cross-check device state against backend logs and integration behavior by hand. What starts to slip is not only response speed, but confidence. The team sees more signals, yet feels less sure which ones matter and which ones can wait.

Golang memory arenas [101 guide]

Go 1.20 introduced an experimental arena package that lets you allocate many objects from a contiguous region of memory and free them all at once — bypassing the garbage collector entirely. The package remains experimental and its future is uncertain, but arenas are a valuable concept for understanding Go memory management and writing high-performance code. The arena package is experimental and on hold indefinitely. The Go team has made no guarantees about compatibility or its continued existence.

Observability vs Monitoring: Why the Difference Still Matters in Complex Systems

In modern infrastructure, the words observability and monitoring are often used as if they mean the same thing. That shortcut sounds harmless, but it creates real confusion inside technical teams and business discussions. The two ideas are connected, yet they solve different problems. In simple systems, the gap may feel small. In complex systems, the gap becomes impossible to ignore because the cost of misunderstanding it usually appears during failure, not during routine operation.

Evaluating Observability Tools for the AI Era

Every observability vendor has an AI story right now. Most have an MCP. Many have a chatbot. All have a demo where the AI finds the root cause of an incident in thirty seconds and everyone in the room nods. In the context of a public demo, these tools look almost identical. Ask the AI a question, the tool returns an answer, and the engineer fixes the bug. Impressive. But if you buy based on the demo, you may end up with an AI layer that looks great on a call and disappoints in production.