
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Monitor OCI spend, AI in DDSQL Editor, OTLP Metrics API, and more | This Month in Datadog

See how you can gain insights into cloud costs by tracking OCI spend and easily comparing instance types in October’s episode of This Month in Datadog. Join us for a spotlight on Cloud Cost Management’s support for Oracle Cloud Infrastructure and on the product’s new feature, Instance Explorer, which lets you visualize and compare the cost and performance of instances across AWS, Azure, and Google Cloud.

AWS & Splunk: Accelerating Innovation Through Partnership

Discover how AWS and Splunk are pushing the boundaries of innovation to empower your security, observability, and cloud transformation journey. This video highlights our joint commitment to driving digital resilience through unified visibility, faster threat detection, and seamless integration across AWS services.

Synthetic Monitoring for GraphQL Endpoints: Beyond the Query

GraphQL isn’t just another API protocol; it’s a new layer of abstraction. It collapses dozens of REST endpoints into one flexible interface where clients decide what data to fetch and how deep to go. That freedom is a gift for front-end teams and a headache for anyone tasked with reliability. Traditional monitoring doesn’t work here. A REST endpoint can be pinged for uptime.
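To illustrate why an uptime ping falls short for GraphQL, here is a minimal sketch of the kind of response check a synthetic test might run. It assumes the common convention that a GraphQL server can answer with HTTP 200 while reporting failures in an `errors` array next to a partial or null `data` object. The function name `evaluate_graphql_response` and the dotted `required_paths` format are hypothetical and not taken from any monitoring product.

```python
import json


def evaluate_graphql_response(status_code, body_text, required_paths):
    """Return a list of failure reasons; an empty list means the check passed.

    required_paths is a hypothetical convention: dotted paths under "data"
    (for example "user.id") that must be present and non-null.
    """
    failures = []
    if status_code != 200:
        failures.append(f"unexpected HTTP status {status_code}")

    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        failures.append("response body is not valid JSON")
        return failures
    if not isinstance(payload, dict):
        failures.append("response body is not a JSON object")
        return failures

    # A 200 response can still carry errors, so inspect the body itself.
    errors = payload.get("errors")
    if isinstance(errors, list) and errors:
        messages = [
            str(e.get("message", e)) if isinstance(e, dict) else str(e)
            for e in errors
        ]
        failures.append("GraphQL errors: " + "; ".join(messages))

    # Walk each required path and flag anything missing or null.
    data = payload.get("data")
    for path in required_paths:
        node = data
        for key in path.split("."):
            if isinstance(node, dict) and node.get(key) is not None:
                node = node[key]
            else:
                failures.append(f"missing or null field: data.{path}")
                break
    return failures


if __name__ == "__main__":
    body = '{"data": {"user": null}, "errors": [{"message": "boom"}]}'
    for reason in evaluate_graphql_response(200, body, ["user.id"]):
        print(reason)
```

Keeping the evaluation separate from the HTTP call means the same check can run against a live endpoint or against recorded responses.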

Grafana Mimir 3.0 release: performance improvements, a new query engine, and more

In 2022, we introduced Grafana Mimir, our open source, horizontally scalable, multi-tenant time series database (TSDB) designed for long-term storage of Prometheus and OpenTelemetry metrics. Over the years, Mimir has become a go-to metrics backend within the open source community, with 30 project maintainers and more than 4.7k GitHub stars.

Stop the guesswork: Troubleshoot with confidence with process monitoring

IT infrastructure is vast, complex, and interdependent. At any point in time, businesses rely on thousands of servers running thousands of processes. Detecting server downtime is fairly easy, but true observability means knowing precisely which processes are working as intended and which are silently contributing to performance degradation. A failed database worker or a memory-leaking background service can drain resources unnoticed until your most critical apps grind to a halt.

Accelerate your Azure integration setup with guided onboarding

Getting started with monitoring for Microsoft Azure environments can be a lengthy and manual process. Many tools require users to create app registrations, assign permissions, and enable log forwarding or telemetry data collection across multiple portals and scripts. These fragmented steps slow down onboarding and introduce opportunities for misconfiguration, making it harder for teams to quickly achieve full visibility.

Understand user experience through network performance with Datadog Synthetic Monitoring

When an application slows down or fails, pinpointing the cause isn’t always simple. Is it a backend regression, a misbehaving API, or a bottleneck somewhere deep in the network? Without full visibility, teams waste precious time troubleshooting across disconnected tools and layers. Datadog Synthetic Monitoring now supports Network Path to help you proactively identify whether user-facing issues stem from your code or from the underlying network.

OTel Updates: Declarative Config - A Steadier Way to Configure OpenTelemetry SDKs

Application configs change over time, often in small ways that are easy to miss. They may start simple — a few environment variables, one exporter, nothing unexpected. As your instrumentation grows, you add rules for filtering health check spans, adjust sampling based on attributes, or introduce environment-specific resource settings. Each change makes sense on its own. But months later, the picture can look different across dev, staging, and production.
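The kind of drift described above can be made visible with a small comparison script. The sketch below flattens nested settings and reports every key whose value differs between environments. The configuration keys (`sampler.ratio`, `drop_health_checks`, `exporter`) are hypothetical placeholders for illustration and do not follow the OpenTelemetry declarative configuration schema.

```python
_MISSING = "<missing>"


def flatten(config, prefix=""):
    """Turn a nested dict into {"a.b.c": value} pairs."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def find_drift(configs_by_env):
    """Return {path: {env: value}} for every setting that is not identical
    across all environments. Settings absent from an environment are
    reported with the "<missing>" marker."""
    flat = {env: flatten(cfg) for env, cfg in configs_by_env.items()}
    all_paths = sorted({path for f in flat.values() for path in f})
    drift = {}
    for path in all_paths:
        values = {env: f.get(path, _MISSING) for env, f in flat.items()}
        # Compare by repr so unhashable values such as lists are handled too.
        if len({repr(v) for v in values.values()}) > 1:
            drift[path] = values
    return drift


if __name__ == "__main__":
    # Hypothetical settings for two environments.
    configs = {
        "dev": {"sampler": {"ratio": 1.0}, "exporter": "otlp"},
        "prod": {
            "sampler": {"ratio": 0.1},
            "exporter": "otlp",
            "drop_health_checks": True,
        },
    }
    for path, values in find_drift(configs).items():
        print(path, values)
```

A report like this only shows that environments differ. Whether a given difference is intentional still has to be decided by the team that owns the configuration.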