Latest News

Chaos engineering + monitoring, part 2: for starters

Oh man, did I get ahead of myself in my last post! I started chatting about tools, and I realize now that I really should have been talking more about why I’m using Sensu and Gremlin. But it didn’t occur to me until last year at Monitorama. John Allspaw gave the keynote talk (Taking Human Performance Seriously). While you can watch the talk here, I’ll highlight a few points...

Using Honeycomb to remember to delete a feature flag

Feature flags are great and serve us in so many ways. However, we do not love long-lived feature flags. They lead to more complicated code, and when we inevitably default them to be true for all our users, they lead to unused sections of code. In other words, tech debt. How do we stay on top of this? Find out how Honeycomb’s trigger alerts proactively tell you to go ahead and clean up that feature flag tech debt!

Getting At The Good Stuff: How To Sample Traces in Honeycomb

(This is the first post by our new head of Customer Success, Irving.) Sampling is a must for applications at scale; it’s a technique for reducing the burden on your infrastructure and telemetry systems by only keeping data on a statistical sample of requests rather than 100% of requests. Large systems may produce large volumes of similar requests which can be de-duplicated.
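As a rough illustration of the idea in this excerpt, here is a minimal sketch of deterministic trace sampling in Python. It is not Honeycomb's API: the function names, the `trace_id` field, and the `sample_rate` field are assumptions made for the example. Hashing the trace ID means every event in a trace gets the same keep-or-drop decision, and recording the rate on each kept event lets you estimate the original volume later.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: int) -> bool:
    """Keep roughly 1 in `sample_rate` traces, deterministically per trace ID.

    Hypothetical helper: hashing the ID (instead of rolling a random number)
    gives every event in the same trace the same decision.
    """
    if sample_rate <= 1:
        return True
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket % sample_rate == 0


def sample_events(events, sample_rate):
    """Return the kept events, each annotated with the rate it was sampled at."""
    kept = []
    for event in events:
        if keep_trace(event["trace_id"], sample_rate):
            kept.append({**event, "sample_rate": sample_rate})
    return kept


def estimated_total(kept_events):
    """Estimate the pre-sampling event count: each kept event stands for `sample_rate` events."""
    return sum(event["sample_rate"] for event in kept_events)


if __name__ == "__main__":
    events = [{"trace_id": f"trace-{i}", "name": "request"} for i in range(10_000)]
    kept = sample_events(events, sample_rate=10)
    print(len(kept), "events kept; estimated original volume:", estimated_total(kept))
```

The estimate is only approximate, and a fixed rate like this treats all traffic alike; real systems often sample rare or failing requests at a lower rate than common, successful ones.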

Observability Trends in 2020 and Beyond: Announcing the DevOps Pulse 2019 Results

2020 is here and it looks like it’ll be a truly exciting and impactful year for the DevOps community. As you know, the landscape is changing rapidly, and as a result, new technologies and methodologies are emerging to solve challenges you’re experiencing on the job. Observability is one such concept, and achieving it is a huge challenge for software engineers across the globe.

Instrumenting Lambda with Traces: A Complete Example in Python

We’re big fans of AWS Lambda at Honeycomb. As you may have read, we recently made some major improvements to our storage engine by leveraging Lambda to process more data in less time. Making a change to a complex system like our storage engine is daunting, but can be made less so with good instrumentation and tracing. For this project, that meant getting instrumentation out of Lambda and into Honeycomb.

Honeycomb SLO Now Generally Available: Success, Defined.

Honeycomb now offers SLOs, aka Service Level Objectives. This is the second in a set of essays on creating SLOs from first principles. Previously, in this series, we created a derived column to show how a back-end service was doing. That column categorized every incoming event as passing, failing, or irrelevant. We then counted up the column over time to see how many events passed and failed. But we had a problem: we were doing far too much math ourselves.
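To make the "far too much math" concrete, here is a minimal Python sketch of the hand-rolled approach the excerpt describes: classify each event as passing, failing, or irrelevant, then count. It is not Honeycomb's derived-column syntax or SLO feature. The field names (`endpoint`, `status`, `duration_ms`), the `/api/query` endpoint, and the 1000 ms threshold are all assumptions chosen for illustration.

```python
from typing import Iterable, Optional


def sli(event: dict) -> Optional[bool]:
    """Classify one event: True = passing, False = failing, None = irrelevant.

    Hypothetical rule: only requests to /api/query count, and a request
    passes if it did not return a 5xx status and finished in under 1000 ms.
    """
    if event.get("endpoint") != "/api/query":
        return None
    return event["status"] < 500 and event["duration_ms"] < 1000


def compliance(events: Iterable[dict]) -> Optional[float]:
    """Fraction of relevant events that passed, or None if none were relevant."""
    passed = failed = 0
    for event in events:
        result = sli(event)
        if result is None:
            continue  # irrelevant events are left out of the ratio entirely
        if result:
            passed += 1
        else:
            failed += 1
    total = passed + failed
    return passed / total if total else None


if __name__ == "__main__":
    sample = [
        {"endpoint": "/api/query", "status": 200, "duration_ms": 120},
        {"endpoint": "/api/query", "status": 200, "duration_ms": 2400},
        {"endpoint": "/api/query", "status": 503, "duration_ms": 80},
        {"endpoint": "/healthz", "status": 200, "duration_ms": 3},
    ]
    print("compliance:", compliance(sample))
```

Comparing that ratio against a target over a time window is the kind of bookkeeping that becomes tedious to maintain by hand.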

From "Secondary Storage" To Just "Storage": A Tale of Lambdas, LZ4, and Garbage Collection

When we introduced Secondary Storage two years ago, it was a deliberate compromise between economy and performance. Compared to Honeycomb’s primary NVMe storage attached to dedicated servers, secondary storage let customers keep more data for less money. They could query over longer time ranges, but with a substantial performance penalty; queries which used secondary storage took many times longer to run than those which didn’t.