
New in the Honeycomb Academy: Learn to Use the Honeycomb MCP

Two things happen when engineers first connect the Honeycomb MCP to their AI assistant. The first is the blank page problem. The Honeycomb UI gives you something to react to: a heatmap, a query builder, a trace to click into. An AI assistant gives you a cursor and nothing else. When you don't know where to start, that's a hard place to be. The second shows up right after you get past the first one. You ask a question, you get a confident-sounding answer, and you're not sure whether to trust it.

Two AI agents, one incident: Rocky AI comes to the terminal

A Playwright Check fails at 2 am. The login flow is broken. Until today, that alert triggered a human to get up, open the Checkly dashboard, copy the Rocky AI root cause analysis (RCA), and then tell an agent to get to work. There were two AI agents, one incident, and no way for them to talk to each other. The extended checkly checks and new checkly rca CLI commands close that gap. Your coding agent can now pull Rocky AI's analysis into its ongoing work, read the diagnosis, and go fix the code.

VM Migration to Kubernetes: What Breaks and How to Prevent It

Here is what nobody putting together the business case for a VM migration to Kubernetes will tell you upfront: the compute is the easy part. Moving workloads off vSphere and onto Kubernetes is conceptually straightforward. The tooling has matured. The architecture is proven. Compute moves, storage remaps, and the platform team has a plan. The network is where projects quietly stall.

How to run a proof of concept that de-risks your monitoring decision

Part 3 of key insights from a fireside chat with Chris Yates. Read part 1 here, and part 2 here. Most database monitoring proofs of concept (POCs) answer the wrong questions. Here's how to structure a POC that genuinely de-risks your vendor decision, along with the questions to ask during the process. A POC is often treated as the final hurdle in vendor evaluation, but too often it becomes theatre: a guided tour of the flashiest features, run by one person, under unrealistic conditions.

End-to-End Trace Propagation Across SQS and Lambda with OpenTelemetry

SQS doesn't propagate trace context automatically. You instrument both sides, deploy, and get two disconnected traces. This post shows how to wire them into one waterfall, and covers the ESM format gotcha that silently breaks it every time. Prathamesh works as an evangelist at Last9, runs SRE stories (where SRE and DevOps folks share their stories), and maintains o11y.wiki, a glossary of terms related to observability.
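To make the trace-context idea concrete, here is a minimal sketch in plain Python of carrying a W3C `traceparent` value through SQS message attributes. It is not the post's implementation and does not use the OpenTelemetry SDK: the function names (`new_traceparent`, `inject`, `to_lambda_record`, `extract`) are hypothetical, and `to_lambda_record` only simulates how attributes show up in a Lambda SQS event record. One detail worth noticing is that the send-side shape uses `DataType`/`StringValue`, while the Lambda event record uses `dataType`/`stringValue`.

```python
import re
import secrets

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent():
    """Build a fresh traceparent value with the 'sampled' flag set (illustrative)."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def inject(traceparent):
    """Producer side: attributes in the shape SQS SendMessage expects."""
    return {"traceparent": {"DataType": "String", "StringValue": traceparent}}


def to_lambda_record(body, message_attributes):
    """Hypothetical helper: mimic the key casing of a Lambda SQS event record."""
    return {
        "body": body,
        "messageAttributes": {
            name: {"dataType": attr["DataType"], "stringValue": attr["StringValue"]}
            for name, attr in message_attributes.items()
        },
    }


def extract(record):
    """Consumer side: return the parent context, or None if absent or malformed."""
    attr = record.get("messageAttributes", {}).get("traceparent")
    if not attr:
        return None
    match = TRACEPARENT_RE.match(attr.get("stringValue") or "")
    if not match:
        return None
    trace_id, parent_span_id, flags = match.groups()
    return {"trace_id": trace_id, "parent_span_id": parent_span_id, "flags": flags}


if __name__ == "__main__":
    tp = new_traceparent()
    record = to_lambda_record('{"order": 1}', inject(tp))
    print(extract(record))
```

In a real setup the consumer would hand the extracted IDs to its tracing library so the Lambda span becomes a child of the producer span; with no attribute present, `extract` returns `None` and the consumer would start a new, disconnected trace.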

5 Best SOC 2 Continuous Monitoring Tools for SaaS: Closing the 20% Manual Evidence Gap

Landing a big-logo customer feels great, until their security questionnaire hits your inbox. For most B2B SaaS teams, SOC 2 compliance is the roadblock. You connect a tool, dashboards turn green, and then progress stalls: about 20% of evidence still needs screenshots, sign-offs, or frantic Slack chases. That last-mile grind drags engineers back into spreadsheets just when the audit seems done.

Why Copilot alone won't fix your business workflows

Microsoft has been pushing Copilot hard over the past year. Between the rebrand of Office to Microsoft 365 Copilot, the launch of Copilot Tasks, and the more recent arrival of Copilot Cowork, there is a clear message: AI is supposed to handle the heavy lifting. For many businesses, though, the reality is more complicated than the marketing suggests. Copilot is a strong productivity tool within its own ecosystem, but expecting it to fix workflows that span multiple disconnected systems is where things start to fall apart.