
The latest news and information on observability for complex systems and related technologies.

The CoPE and Other Teams, Part 1: Introduction & Auto-Instrumentation

The CoPE is made to affect, meaning change, how things work. The disruption it produces is a feature, not a bug. That disruption pushes things away from a locally optimal, comfortable state that generates diminishing returns. It sets things on a course of exploration to find new terrains which may benefit it more—and for longer.

Making The Case for Continuous Observability

Software complexity grows exponentially, while developer efficiency grows far more slowly. Debugging often takes up 20-50% of development time. More complex, connected systems mean increased data flow at the edge and in the cloud. That leads to increased exposure to vulnerabilities, cyber threats, malfunctions, and bugs with risks that are hard to assess.

Transform and enrich your logs with Datadog Observability Pipelines

Today’s distributed IT infrastructure consists of many services, systems, and applications, each generating logs in different formats. These logs contain layers of important information used for data analytics, security monitoring, and application debugging. However, extracting valuable insights from raw logs is complex, requiring teams to first transform the logs into a well-known format for easier search and analysis.

Sustainable Computing in Observability with Kunal Nawale

Kunal Nawale, Founder and CEO of SigLens, presents on sustainable computing and observability. Understand the significant energy impact of data centers and how efficient observability can reduce both costs and carbon emissions. Learn about data storage optimization and how SigLens’s open-source solution offers a 90% cost reduction compared to traditional systems like Splunk and Elasticsearch.

Get granular LLM observability by instrumenting your LLM chains

The proliferation of managed LLM services like OpenAI, Amazon Bedrock, and Anthropic has introduced a wealth of possibilities for generative AI applications. Application engineers are increasingly creating chain-based architectures and using prompt engineering techniques to build LLM applications for their specific use cases.

Destroy on Friday: The Big Day A Chaos Engineering Experiment - Part 2

In my last blog post, I explained why we decided to destroy one third of our infrastructure in production just to see what would happen. This is part two, where I go over the big day. How did our chaos engineering experiment go? Find out below!

Streamlining Debugging with Lightrun Snapshots: A Superior Alternative to Trace Logging

According to a recent study, failing tests alone cost the enterprise software market an astonishing $61 billion annually. This figure reflects the vast resources devoted to rectifying software failures, translating into about 620 million developer hours lost each year. On average, engineers spend 13 hours resolving a single software failure, a statistic that paints a stark picture of the current state of debugging efficiency.

Cribl's Blueprint for Secure Software Development

What does it take to build software for the most security-demanding customers worldwide? At Cribl, building secure products is integral to our engineering identity. We have established a secure software development lifecycle that is both culturally and policy-driven, integrating product security tooling and processes into every architecture review, pull request, and release, whether major or minor.