Latest Posts

Aligning Business and Engineering Goals with Honeycomb SLOs

Aug 19, 2024 By Priscilla Lam In Honeycomb

Setting clear, measurable goals is essential for any successful team. However, aligning those goals with the technical work can be challenging in the fast-paced world of software engineering. Engineers might focus on reducing latency or improving uptime, while business leaders look at revenue and customer satisfaction. It gets tricky to trace the connection between the two and to justify which engineering initiatives matter, why, and how they affect the bottom line.

A CoPE's Guide to Alert Management

Aug 15, 2024 By Nick Travaglini In Honeycomb

Alerts are a perennial topic, and a CoPE will need to engage with them. The bounds of this problem space are formed by two types of alerts. Understanding what these alerts are and how to configure them is one thing. Thinking about what each does for your organization, and how using one or the other affects things, is another. The latter will be the focus of this article.

The CoPE and Other Teams, Part 2: Custom Instrumentation and Telemetry Pipelines

Aug 8, 2024 By Nick Travaglini In Honeycomb

The previous post laid out the basic idea of instrumentation and how OpenTelemetry’s auto-instrumentation can get teams started. However, you can’t rely only on auto-instrumentation. This post will discuss the limitations in more detail and how a CoPE can help teams overcome them.

Deploying the OpenTelemetry Collector to AKS

Aug 7, 2024 By Martin Thwaites In Honeycomb

While investigating some issues users raised around the OpenTelemetry Collector running in AKS, I found a few nuances that are worth noting. In this article, I'll go over some changes you have to implement in your values.yaml to make it work for you.

Apdex in Honeycomb

Aug 5, 2024 By Max Aguirre In Honeycomb

“How is my app performing?” is one of the most common, yet hardest questions to answer. There are myriad ways to measure this, like error rate, average response time, and so on. Enter the Application Performance Index (aka Apdex), a single metric that attempts to answer, “Are my application’s users happy?” Apdex is an open standard that was formalized in 2005 by the Apdex Alliance.
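As a rough illustration of the metric, the standard Apdex score counts requests at or under a target threshold T as satisfied, requests between T and 4T as tolerating (worth half), and anything slower as frustrated. The sketch below assumes response times in milliseconds and a team-chosen threshold; the function name `apdex` and the sample values are hypothetical, and this is not how Honeycomb itself computes the score.

```python
def apdex(response_times_ms, threshold_ms):
    """Return the Apdex score (0.0 to 1.0) for a list of response times.

    satisfied:  t <= T
    tolerating: T < t <= 4T   (counted as half)
    frustrated: t > 4T        (counted as zero)
    """
    if not response_times_ms:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times_ms if t <= threshold_ms)
    tolerating = sum(
        1 for t in response_times_ms if threshold_ms < t <= 4 * threshold_ms
    )
    return (satisfied + tolerating / 2) / len(response_times_ms)


# Hypothetical samples with T = 500 ms:
# two satisfied, two tolerating, one frustrated.
score = apdex([100, 200, 600, 1500, 2500], threshold_ms=500)
print(score)  # 0.6
```

A score of 1.0 means every sampled request met the threshold, while 0.0 means every request took longer than four times the threshold.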

Making Room for Some Lint

Jul 29, 2024 By Fred Hebert In Honeycomb

It’s one of my strongly held beliefs that errors are constructed, not discovered. However we frame an incident’s causes, contributing factors, and context ends up influencing the shape of the corrective items (if any) that get created. I’ll cover these ideas by using our June 3rd incident where a database migration caused a large outage by locking up a shared database and making it run out of connections.

The CoPE and Other Teams, Part 1: Introduction & Auto-Instrumentation

Jul 25, 2024 By Nick Travaglini In Honeycomb

The CoPE is made to affect, meaning change, how things work. The disruption it produces is a feature, not a bug. That disruption pushes things away from a locally optimal, comfortable state that generates diminishing returns. It sets things on a course of exploration to find new terrains which may benefit it more—and for longer.

Destroy on Friday: The Big Day A Chaos Engineering Experiment - Part 2

Jul 23, 2024 By Lex Neva In Honeycomb

In my last blog post, I explained why we decided to destroy one third of our infrastructure in production just to see what would happen. This is part two, where I go over the big day. How did our chaos engineering experiment go? Find out below!

What Makes for a 'Good' Pair Programming Session?

Jul 18, 2024 By Ruthie Irvin In Honeycomb

Software changes so rapidly that developing on the cutting edge of it cannot fall to a single person. When it comes to asynchronously disseminating information about projects, code comments, PR conversations, Slack, RFCs, and other investigatory documents do a wonderful job, but no amount of async communication replaces the magic of two brains bouncing ideas off of each other.

Deploy on Friday? How About Destroy on Friday! A Chaos Engineering Experiment - Part 1

Jul 16, 2024 By Lex Neva In Honeycomb

We recently took a daring step to test and improve the reliability of the Honeycomb service: we abruptly destroyed one third of the infrastructure in our production environment using AWS’s Fault Injection Service. You might be wondering why the heck we did something so drastic. In this post, we’ll go over why we did it and how we made sure that it wouldn’t impact our service.
