Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Observability for complex systems and related technologies.

How to Solve "Cannot Reproduce" Bugs That Cost Support Teams Hours

Support teams frequently face vague customer reports and incomplete data but need to offer fast resolutions autonomously without escalating to developers. In this article, learn how to equip support engineers with tools to diagnose root causes in minutes, increasing self-sufficient issue resolution. We explore eliminating the ‘Reproduction Tax’ for ‘cannot reproduce’ bugs using runtime context to achieve technical certainty at scale.

Evaluating Observability Tools for the AI Era

Every observability vendor has an AI story right now. Most have an MCP. Many have a chatbot. All have a demo where the AI finds the root cause of an incident in thirty seconds and everyone in the room nods. In the context of a public demo, these tools look almost identical. Ask the AI a question, the tool returns an answer, and the engineer fixes the bug. Impressive. But if you buy based on the demo, you may end up with an AI layer that looks great on a call and disappoints in production.

Observability Where You Work: Introducing the Honeycomb Slackbot in Beta

Engineers are constantly context switching between tools, adding cognitive overhead on top of already complex work. You're deep in an investigation, you need to analyze some data, pull up a runbook somewhere else, and share findings back in Slack. Context gets lost in the shuffle, correlating across data sources becomes painful, and everything just takes longer. In high-pressure situations like incidents, that friction has a real cost to the business.

Honeycomb Metrics Is Now Generally Available

It’s Black Friday. Checkout latency is spiking. Your on-call engineer pulls up the dashboard and starts working through the list. Is it a regional issue? No, all regions look fine. A payment provider? Stripe, PayPal, Apple Pay all nominal. A bad deployment? Nothing shipped in the last six hours. All your infrastructure dashboards are showing green. But customers are complaining. Checkout is slow, carts are being abandoned and revenue is draining away.

The best observability platforms for developers

At some point, logs stop being enough. As applications grow more distributed, understanding what's actually happening in production becomes harder. That's what observability platforms are built for. The hard part is figuring out which one is actually right for your application — and your budget. This guide covers some popular options: what they do well, where they fall short, and who they're for.

Olly for SREs: 3 ways I actually use it in production

There’s a moment after an alert where you’re not fixing anything yet. You’re trying to answer a much simpler question: Is it actually down? Sometimes it’s obvious. Sometimes it’s 20 alerts at once with no clear starting point. Sometimes it’s a small upstream degradation that might cascade. Sometimes it’s just a spike that resolves on its own. That first phase is orientation. Is the signal real or transient? Is it isolated or spreading? Root cause or symptom?

What's New at Cribl 4.17: On release days, we wear teal.

In this episode, Leon runs through all the updates in Cribl release 2603, which includes a massive update to Cribl Search, the ability to detect PII and secrets in the background as part of Cribl Guard, and two cool enhancements to Cribl Packs - monitoring and enhanced routing.

What is Cribl Guard background detection?

Security and compliance teams need to know exactly what sensitive data is flowing through their environments and where it’s going. Because surprise PII is no one’s favorite kind of surprise. Meanwhile, upstream teams are shipping new apps, changing schemas, adding fields, and generally moving fast. However, you can only manage and protect the data you currently know of and expect. But sensitive data has a habit of showing up where no one expected it…

Meet the new Cribl Search: Faster investigations with AI

Get a quick look at the new Cribl Search experience, built to help teams investigate faster, onboard data easily, and get answers from their logs without complex query languages. In this quick overview, we show how Cribl Search helps you move from raw data to insights in minutes. The result? Faster investigations, simpler workflows, and powerful AI-assisted analysis across your telemetry. Learn how the new Cribl Search makes exploring and analyzing data easier for everyone, from experienced analysts to teams just getting started.

Create a Custom Service Health Board With the Honeycomb MCP

Your software is sending data to Honeycomb. Now where is the dashboard you want? The best dashboard is one created just for your application, or your service, or your team. You can get that in minutes with the Honeycomb MCP. Open your coding agent in your IDE, or on the command line in your code repository. Configure the Honeycomb MCP and authenticate with Read and Write permissions. Now tell it what you want. You can be high-level: "Make me a service health board for the frontend service."