Latest Posts

Getting At The Good Stuff: How To Sample Traces in Honeycomb

(This is the first post by our new head of Customer Success, Irving.) Sampling is a must for applications at scale; it’s a technique for reducing the burden on your infrastructure and telemetry systems by only keeping data on a statistical sample of requests rather than 100% of requests. Large systems may produce large volumes of similar requests which can be de-duplicated.
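As a rough illustration of the idea (not Honeycomb's implementation), the sketch below keeps about one in every `sample_rate` traces by hashing the trace ID, so every span in a trace gets the same keep-or-drop decision. The function names `should_sample` and `estimate_total`, and the choice of SHA-1, are assumptions made for this example.

```python
import hashlib


def should_sample(trace_id: str, sample_rate: int) -> bool:
    """Keep roughly 1 in `sample_rate` traces (hypothetical helper).

    The decision depends only on the trace ID, so all spans that share
    a trace ID are kept or dropped together.
    """
    if sample_rate <= 1:
        return True  # a rate of 1 means "keep everything"
    digest = hashlib.sha1(trace_id.encode("utf-8")).digest()
    value = int.from_bytes(digest[:4], "big")
    return value % sample_rate == 0


def estimate_total(kept_sample_rates) -> int:
    """Estimate the original request count from the kept events.

    Each kept event stands in for `sample_rate` original requests,
    so the estimate is the sum of the recorded rates.
    """
    return sum(kept_sample_rates)


if __name__ == "__main__":
    rate = 10
    trace_ids = [f"trace-{i}" for i in range(10_000)]
    kept = [t for t in trace_ids if should_sample(t, rate)]
    print(len(kept), estimate_total([rate] * len(kept)))
```

Recording the sample rate alongside each kept event is what lets counts be scaled back up to an estimate of the original traffic.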

Instrumenting Lambda with Traces: A Complete Example in Python

We’re big fans of AWS Lambda at Honeycomb. As you may have read, we recently made some major improvements to our storage engine by leveraging Lambda to process more data in less time. Making a change to a complex system like our storage engine is daunting, but can be made less so with good instrumentation and tracing. For this project, that meant getting instrumentation out of Lambda and into Honeycomb.

Honeycomb SLO Now Generally Available: Success, Defined.

Honeycomb now offers SLOs, aka Service Level Objectives. This is the second in a set of essays on creating SLOs from first principles. Previously, in this series, we created a derived column to show how a back-end service was doing. That column categorized every incoming event as passing, failing, or irrelevant. We then counted up the column over time to see how many events passed and failed. But we had a problem: we were doing far too much math ourselves.
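To make the "math ourselves" concrete, here is a minimal sketch of that manual approach in plain Python rather than a Honeycomb derived column. The event fields (`path`, `status`, `duration_ms`) and the pass/fail thresholds are assumptions chosen for the example, not the rule used in the series.

```python
from typing import Iterable, Optional, Tuple


def classify(event: dict) -> Optional[bool]:
    """Return True (passing), False (failing), or None (irrelevant).

    Hypothetical rule: only /api requests count toward the objective,
    and a request passes if it did not error and finished in under 1s.
    """
    if not event.get("path", "").startswith("/api"):
        return None
    return event["status"] < 500 and event["duration_ms"] < 1000


def compliance(events: Iterable[dict]) -> Tuple[int, int, Optional[float]]:
    """Count passing and failing events and compute the passing ratio."""
    passed = failed = 0
    for event in events:
        result = classify(event)
        if result is None:
            continue  # irrelevant events do not count either way
        if result:
            passed += 1
        else:
            failed += 1
    total = passed + failed
    ratio = passed / total if total else None
    return passed, failed, ratio


if __name__ == "__main__":
    sample_events = [
        {"path": "/api/users", "status": 200, "duration_ms": 120},
        {"path": "/api/users", "status": 503, "duration_ms": 80},
        {"path": "/healthz", "status": 200, "duration_ms": 2},
    ]
    print(compliance(sample_events))
```

Comparing the ratio against a target over a rolling window is the part that has to be repeated by hand in this approach.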

From "Secondary Storage" To Just "Storage": A Tale of Lambdas, LZ4, and Garbage Collection

When we introduced Secondary Storage two years ago, it was a deliberate compromise between economy and performance. Compared to Honeycomb’s primary NVMe storage attached to dedicated servers, secondary storage let customers keep more data for less money. They could query over longer time ranges, but with a substantial performance penalty; queries which used secondary storage took many times longer to run than those which didn’t.

Why I'm Grateful For Our Observability Community

It’s that season, when we take time to consider what we’re grateful for and extend thanks to those we value and the experiences we treasure. One special aspect of America’s Thanksgiving holiday is the inclusiveness of celebrating across all communities and simply sharing, taking time to enjoy the fruits of the land. Giving thanks in late November can bring some fulfillment, but it should also be a reminder that we need to practice gratitude more regularly.

OpenTelemetry, OpenTracing, OpenCensus: An Introduction and Glossary

There’s been a fair bit of buzz lately about OpenTelemetry, which is the next major version of the OpenTracing and OpenCensus projects. The leadership of those two projects has come together to create OpenTelemetry, which combines the best parts of OpenTracing and OpenCensus to create one open source project to help with your instrumentation needs.

PagerDuty Summit Rocked

We had an excellent time at PagerDuty Summit 2019. The folks we met at the booth and in the hallway track felt particularly kindred to Team Honeybee: we’ve all been on-call, and we all want it to be better. The main themes of the conference revolved around best practices and learnings for finding issues faster, knowing exactly what to do in an incident, enabling on-call to know how to prioritize an alert, and what our community can do to improve on-call life.