
The latest news and information on observability for complex systems and related technologies.

Kubernetes observability: How to enrich logs with GeoIP using the Kubernetes Monitoring Helm Chart

When your Kubernetes app suddenly has traffic spikes in a distant country, it can be difficult to determine why. Let’s say, for example, we have an e-commerce app that started to receive an unusual surge of visitors from Australia — something we never anticipated. We search for answers in our logs, but without geographic context, we don’t have the full insights we need.

Detect hallucinations in your RAG LLM applications with Datadog LLM Observability

Hallucinations occur when a large language model (LLM) confidently generates information that is false or unsupported. These responses can spread misinformation that jeopardizes safety, causes reputational damage, and erodes user trust. Augmented generation techniques, such as retrieval-augmented generation (RAG), aim to reduce hallucinations by providing LLMs with relevant context from verified sources and prompting the LLMs to cite these sources in their responses.

Simplifying Observability: Streamlining Telemetry with a Centralized Pipeline

Modern applications generate a deluge of telemetry data—logs, metrics, and traces—that hold the key to understanding system performance and reliability. However, managing this data effectively is a growing challenge for DevOps teams. Raw telemetry can overwhelm teams with complexity and noise even when collected via robust standards like OpenTelemetry.

How to Choose an APM Solution: 5 Critical Questions for 2025

An APM (Application Performance Monitoring) solution is software that helps businesses monitor and manage the performance and availability of their applications. APM tools gather data from systems, servers, databases, APIs, and end-user devices to provide deep insights into the root causes of performance issues. APM solutions have evolved far beyond basic monitoring.

Grafana Campfire - Hiring with AI and more about Grafana MCP (Grafana Community Call - May 2025)

In this Campfire community call, we will talk about what's new and what's next for AI in the observability space, and discuss the Grafana MCP server, which provides access to your Grafana instance and the surrounding ecosystem. Join me (Usman), Matt Ryer, Carl Bergquist, and David Kaltschmidt for this exciting session. Special guests: Sarah Zinger, Cyril Tovena, and Ben Sully.

Harnessing Network Observability to Enhance Grid Resilience

Within the utility sector, a lot is changing. Utilities continue to pursue digital transformation, altering the way services are delivered and operations are managed. What hasn’t changed is the criticality of the services provided. These organizations deliver essential resources like natural gas, electricity, and water—services that we as consumers rely upon constantly for our comfort, sustenance, communications, and more.

Inside the Observability Journey: Lessons from CarGurus, Nearform & More

Join us for a dynamic panel from Observability Sessions Boston where leaders from CarGurus, Nearform, and Grafana Labs share their real-world experiences with observability. In this candid discussion, David Frankel (CarGurus) and Joe Szodfridt (Nearform) delve into the challenges of implementing scalable observability practices, moving from centralized models to federated teams, and navigating cloud migration with a focus on performance and cost.

Using the OpenTelemetry Operator to boost your observability

If you’ve ever wrangled sidecars or sprinkled instrumentation code just to get basic trace data, you know the setup overhead isn’t always worth the payoff. But what if it was… just easier? That’s where the OpenTelemetry Operator for Kubernetes steps in… and it plays great with Coralogix out of the box!