The latest News and Information on Observabilty for complex systems and related technologies.
De Watergroep is responsible for the supply of water to more than 3 million customers and hundreds of companies in Belgium. An organisation operating in the public sector, De Watergroep's main goal is to continuously ensure the availability of high-quality drinking water. De Watergroep also is constantly engaged in technological innovation, focusing on keeping distribution costs low, and making maintenance more cost efficient.
Observability data, and especially log data, is immensely valuable for modern business. Making the right decision—from monitoring the bits and bytes of application code to the actions in the security incident response center—requires the right people to generate insights from data as fast as possible.
Starting today, Honeycomb Metrics is now generally available to all Enterprise customers. You’ve adopted our event-based observability practices, in part to overcome the debugging roadblocks you hit when using custom metrics to identify application issues. But metrics do still provide value at the systems level. Now, you can easily see and use your metrics data alongside your event data in Honeycomb—all in one interface.
Open Telemetry represents an effort to combine distributed tracing, metrics and logging into a single set of system components and language-specific libraries. Recently, OpenTelemetry became a CNCF incubating project, but it already enjoys quite a significant community and vendor support. OpenTelemetry defines itself as “an observability framework for cloud-native software”, although it should be able to cover more than what we know as “cloud-native software”.
Although conversation about observability often ignores SREs, SREs have a central role to play in observability success.
There’s no strict definition of a distributed system. But generally speaking, if you have reached a point where you’re running more than five interdependent services at once, that means you’re running a distributed system. It also means you are more than likely experiencing difficulties when troubleshooting using traditional debugging tools. Unfortunately, pulling up multiple tools, each built for a monolithic world, doesn’t help pinpoint the problem.
Systems run into problems all the time. To keep things running smoothly, we need to have an error monitoring and logging system to help us discover and resolve whatever issue that may arise as soon as possible. The bigger the system the more challenging it becomes to monitor it and pinpoint the issue. And with serverless systems with 100s of services running concurrently, monitoring and troubleshooting are even more challenging tasks.