
The latest News and Information on Observability for complex systems and related technologies.

Honeycomb Metrics Is Now Generally Available

It’s Black Friday. Checkout latency is spiking. Your on-call engineer pulls up the dashboard and starts working through the list. Is it a regional issue? No, all regions look fine. A payment provider? Stripe, PayPal, Apple Pay all nominal. A bad deployment? Nothing shipped in the last six hours. All your infrastructure dashboards are showing green. But customers are complaining. Checkout is slow, carts are being abandoned and revenue is draining away.

What's New at Cribl 4.17: On release days, we wear teal.

In this episode, Leon runs through all the updates in Cribl release 2603, which includes a massive update to Cribl Search, the ability to detect PII and secrets in the background as part of Cribl Guard, and two cool enhancements to Cribl Packs: monitoring and enhanced routing. Try Cribl Now! Sandboxes let you get hands-on experience with Cribl without the fuss or friction.

What is Cribl Guard background detection?

Security and compliance teams need to know exactly what sensitive data is flowing through their environments and where it’s going. Because surprise PII is no one’s favorite kind of surprise. Meanwhile, upstream teams are shipping new apps, changing schemas, adding fields, and generally moving fast. However, you can only manage and protect the data you currently know of and expect. But sensitive data has a habit of showing up where no one expected it…

Meet the new Cribl Search: Faster investigations with AI

Get a quick look at the new Cribl Search experience—built to help teams investigate faster, onboard data easily, and get answers from their logs without complex query languages. In this quick overview, we show how Cribl Search helps you move from raw data to insights in minutes. The result? Faster investigations, simpler workflows, and powerful AI-assisted analysis across your telemetry. Learn how the new Cribl Search makes exploring and analyzing data easier for everyone—from experienced analysts to teams just getting started.

The best observability platforms for developers

At some point, logs stop being enough. As applications grow more distributed, understanding what's actually happening in production becomes harder. That's what observability platforms are built for. The hard part is figuring out which one is actually right for your application — and your budget. This guide covers some popular options: what they do well, where they fall short, and who they're for.

Olly for SREs: 3 ways I actually use it in production

There’s a moment after an alert where you’re not fixing anything yet. You’re trying to answer a much simpler question: Is it actually down? Sometimes it’s obvious. Sometimes it’s 20 alerts at once with no clear starting point. Sometimes it’s a small upstream degradation that might cascade. Sometimes it’s just a spike that resolves on its own. That first phase is orientation. Is the signal real or transient? Is it isolated or spreading? Root cause or symptom?

Create a Custom Service Health Board With the Honeycomb MCP

Your software is sending data to Honeycomb. Now where is the dashboard you want? The best dashboard is one created just for your application, or your service, or your team. You can get that in minutes with the Honeycomb MCP. Open your coding agent in your IDE, or on the command line in your code repository. Configure the Honeycomb MCP and authenticate with Read and Write permissions. Now tell it what you want. You can be high-level: Make me a service health board for the frontend service.

Approaching your observability migration with the right mindset

This guest blog post is authored by Nick Vecellio, Principal Engineer and Co-founder of NoBS, a Premier Datadog Partner specializing in hands-on Datadog migrations and optimizations. At NoBS, we help enterprises migrate their observability stack to Datadog. Teams often come to us after a migration has technically “worked,” but the new setup requires optimization tweaks to provide the clarity, reliability, or operational benefits they’re looking for.

What is Agentic Observability?

Agentic observability is the instrumentation and correlation needed to explain and control agent behavior across multi-step workflows. Legacy observability focuses on runtime health and service behavior. You monitor metrics like CPU usage, memory, latency, and error rates to confirm that applications and infrastructure are functioning as expected. When a workflow degrades, the proximate cause is often a crash, timeout, permission error, or resource constraint.

Top 12 AI and LLM Observability Tools in 2026 Compared: Open-Source and Paid

Artificial intelligence has moved far beyond experimentation. In 2026, AI systems are embedded into customer support workflows, clinical decision support tools, fraud detection engines, and internal copilots across nearly every industry. Adoption is accelerating quickly. According to McKinsey, 23% of organizations are already scaling agentic AI systems, while another 39% are actively experimenting with them. Yet the path to reliable production AI remains uncertain.