Observability

The latest news and information on observability for complex systems and related technologies.

Cloud-native observability from customer to kernel

From its inception as a powerhouse for logging, Elastic Observability has grown into a comprehensive solution for full-stack multi-cloud and hybrid-cloud observability. Given the increasing complexity of the cloud-native world, the major challenge for observability is twofold: getting deeper and more frictionless visibility at all levels of applications, services, and infrastructure, and making sense of the overwhelming amount of data that is available.

7 Must-Have Steps for Production Debugging in Any Language

Debugging is an unavoidable part of software development, especially in production. You can often find yourself in “debugging hell,” where an enormous amount of debugging consumes all your time and keeps the project from progressing. According to a report by the University of Cambridge, programmers spend almost 50% of their time debugging. So how can we make production debugging more effective and less time-consuming?

How Cortex can help you get the most out of Datadog

With Datadog’s Dash conference right around the corner, we at Cortex have been thinking a lot about best practices for observability. To get the most out of an application performance monitoring (APM) vendor like Datadog, you want to make sure monitoring and observability are built into launch and production readiness checklists.

Elastic Universal Profiling helps you deliver fast, affordable, and efficient services

So, what is Universal Profiling™? Universal Profiling™ is fast emerging as an important component of observability. A standard feature inside hyperscalers since approximately 2010, the technology is slowly percolating into the wider industry. Universal Profiling™ allows you to see what your code is doing all the time, in production, across a wide range of languages, and can profile both user-space and kernel-space code.

Feature Focus: September 2022

Another month has come to a close, so I’m back again to take you through what’s new and noteworthy from the month of September. If you missed last month’s blog, this will be a monthly recurring series to keep you posted with the latest and greatest at Honeycomb. There’s a ton to cover, so I’ll dispense with the preamble and dive right in.

The Future of Ops Is Platform Engineering

Two years ago I wrote a piece in The New Stack about the Future of Ops Careers. Towards the end, I wrote: I described the second category as “operations engineering minus the infrastructure,” dedicated to evaluating and assembling a production stack of third-party platform providers, enabling software engineers to self-serve their services and own their own code in production. I said: That second category I was describing now has a name. We call those teams “platform engineering.”

Key Observability Scaling Requirements for Your Next Game Launch: Part III

So far in our series on scaling observability for game launches, we’ve discussed ways to 1) quickly analyze large volumes of telemetry data and 2) ensure high-quality telemetry data for more effective analysis at lower costs. These blogs outline best practices for scaling observability on game launch day, which is necessary to ensure high performance across all infrastructure components, with no lag, no glitches, and no bugs.

Observability and Auto-Remediation

Organizations today are under pressure to stay ahead and keep their IT applications and infrastructure running optimally. That means their IT teams are tasked with making sure that functions move along smoothly while minimizing downtime. To keep the lights on, enterprises add whatever domain-specific tools they need. However, these tools are often reactive, and not nearly robust enough to handle complex application topologies.