
Latest Posts

Authors' Cut - Not-So-Distant Early Warning: Making the Move to Observability-Driven Development

This is how the developer story used to go: You do your coding work once, then you ship it to production—only to find out the code (or its dependencies) has security or other vulnerabilities. So, you go back and repeat your work to fix all those issues. But what if that all changed? What if observability were applied before everything was on fire? After all, observability is about understanding systems, which means more than just production.

An Engineer's Bill of Rights and Responsibilities

Power has a way of flowing towards people managers over time, no matter how many times you repeat “management is not a promotion, it’s a career change.” It’s natural, like water flowing downhill. Managers are privy to performance reviews and other personal information that they need to do their jobs, and they tend to be more practiced communicators.

Tracking Core Web Vitals with Honeycomb and Vercel

Google’s Core Web Vitals (CWVs) are used to rank the performance of mobile sites or pages. It’s easy to see when your CWV scores are low, but it’s not always clear exactly why that’s happening. In Honeycomb’s new guide, Tracking Core Web Vitals with Honeycomb and Vercel, you can learn how to capture, analyze, and debug your real-world CWV performance using a free Honeycomb account.

"Why Are My Tests So Slow?" A List of Likely Suspects, Anti-Patterns, and Unresolved Personal Trauma

“Lead time to deploy” means the interval from when the code gets written to when it’s been deployed to production. It has also been described as “how long it takes you to run CI/CD.” How important is it? It’s nigh-on impossible to have a high-performing team if you have a long lead time, and shortening your lead time makes your team perform better, both directly and indirectly.

Ask Miss O11y: My Manager Won't Let Me Spend Any Time Instrumenting My Code

My organization doesn’t want me spending time on instrumenting my product. What can I do? Thanks for the question! You’ll be relieved to hear that you’re in the majority, and also that there are quick (and easy) steps you can take to prove that instrumenting your code is worthwhile.

Building a Resilient System: Our Journey to Observability at Intercom

At Intercom, we focus on customer experience above all: our service’s availability and performance are our top priorities. That requires a strong culture of observability across our teams and systems. As a result, we invest a lot in the reliability of our application. But unpredictable failures are inevitable, and when they happen it’s humans who fix them. We operate a socio-technical system, and its ability to recover when faced with adversity is called resilience.

Authors' Cut - Debugging with the Core Analysis Loop, and What to Build vs Buy

In the old days, the most senior members of an engineering team were the best debuggers. They had built up such an extensive knowledge about their systems that they instinctively knew the right questions to ask and the right places to look. They even wrote detailed runbooks in an attempt to identify and solve every possible issue and possible permutation of an issue.

Datasets, Traces, and Spans - Oh My!

If you've stumbled (or purposefully landed) on this blog post, chances are you are new to, or diving deeper into, the observability space (o11y for short). Suffice it to say, you’re not in Kansas anymore. In a lot of ways, Honeycomb can serve as a yellow brick road into o11y, and this article should serve as an introduction to how Honeycomb facilitates implementing o11y in applications and distributed services.