Store and search logs at petabyte scale in your own infrastructure with Datadog CloudPrem

As AI workloads and cloud-native applications expand, organizations are generating more log data than ever. Each service, container, and model inference produces continuous telemetry that must be stored, secured, and analyzed. As telemetry grows more complex, teams must balance full visibility with new retention and residency needs.

Automating your synthetic test infrastructure with Datadog Synthetic Monitoring and Terraform

Testing ecosystems contain massive amounts of data, including outlined test scenarios, prerequisite configurations, and the tests themselves. As a result, these ecosystems are prone to data sprawl. This makes it difficult to prevent configuration drift and quickly spin up new tests, especially at the frequency needed to support a fast-growing application. Teams can handle these challenges by treating their tests as part of their application infrastructure.

Store and search logs at petabyte scale in your own infrastructure with Datadog BYOC Logs

As AI workloads and cloud-native applications expand, organizations are generating more log data than ever. Each service, container, and model inference produces continuous telemetry that must be stored, secured, and analyzed. As telemetry grows more complex, teams must balance full visibility with new retention and residency needs.

Datadog named Leader in 2025 Gartner Magic Quadrant for Digital Experience Monitoring

We are thrilled to announce that, for the second consecutive year, Datadog has been named a Leader in the 2025 Gartner Magic Quadrant for Digital Experience Monitoring. We believe that this recognition reflects our continued focus on helping customers observe, secure, and act on everything that matters across their technology stack.

Transform and Migrate Logs with Datadog Custom Processor

See how Datadog’s new Custom Processor in Observability Pipelines helps you transform and migrate logs from platforms like Splunk and Sumo Logic with precision and control. This demo walks through real examples of using VRL (Vector Remap Language) to enrich log data, rewrite timestamps, apply quotas, and securely process archives.

Redefining Frontend Observability with Datadog RUM

Discover how Datadog is redefining frontend observability with Real User Monitoring (RUM). In this demo, see how RUM helps teams detect, investigate, and resolve frontend issues that directly impact user experience and business outcomes. With RUM Without Limits, you get full visibility into every user session, giving you an accurate and comprehensive view of your users’ experiences. Monitor performance, track errors, and understand how your application behaves in real time.

Get organized, actionable insights from complex test environments with Datadog Test Suites

Modern teams often run hundreds of synthetic tests across multiple services, environments, and user journeys. While these tests provide deep visibility, managing them as a flat list can quickly become overwhelming, especially as organizations scale and teams specialize.

Datadog Cloud Cost Management: Make cost a key metric for engineers

See how Datadog Cloud Cost Management puts cost and efficiency KPIs directly in front of engineers in their daily workflows. This short demo shows how Datadog unifies cost, performance, and business metrics in one platform, so FinOps, engineering, and finance teams can make cost-aware decisions together.

How to bridge speed and quality in experiments through unified data

Metrics are fundamental to experimentation for two reasons: They set the basis for evaluating ideas and interventions, and they can suggest where to look next. As such, many teams collect a wide variety of metrics, from application performance data to revenue trends. However, doing so often means manually knitting together data from multiple sources and formats. Even then, data silos can make it challenging to understand the full impact of experimental changes. In this post, we’ll explore…

Introducing Updog.ai: Real-time provider status from Datadog

When external SaaS providers or cloud services degrade or go down, engineers often find themselves wondering if the issue they're encountering is local or more widespread. The answers they find are usually slow to surface, limited in detail, or entirely dependent on the provider's updates. Vendor-controlled status pages and third-party aggregators don’t provide the timely, independent visibility that's necessary to quickly and accurately identify the root cause of slowdowns.

Optimize HPC jobs and cluster utilization with Datadog

High-performance computing (HPC) environments support some of the most critical workloads in the world—from asset pricing models in financial institutions to molecular simulations in drug discovery. These workloads often span hundreds of thousands of cores, depend on specialized infrastructure such as GPUs, and run for extended periods. As a result, performance and efficiency are critical.

Detect and map third-party outages with Datadog External Provider Status

Modern applications depend on dozens of external cloud platforms, APIs, and SaaS services to function. But when those providers experience issues, engineers often spend valuable time asking a basic question: Is the problem with us or with them? Provider-maintained status pages are often slow to update, leaving teams waiting for confirmation while incidents escalate. This delay wastes valuable time, prolongs investigations, and risks customer trust.

Datadog Cloud Cost Management: Telemetry-driven cost allocation

See why Datadog is a leader in cloud cost allocation. In this demo, learn how Datadog uses high-resolution observability data to deliver accurate, dynamic cost attribution across clouds and containerized environments, and how it combines cost, performance, and business context to make cost reporting both accurate and actionable.

A deep dive into Java garbage collectors

Historically, developers have relied on languages like C and C++ for explicit control over memory allocation and deallocation. This approach can yield very low overhead and tight control over performance, but it also increases complexity and risk (e.g., memory leaks, dangling pointers, and double frees). This often results in runtime issues that are difficult to diagnose, which can become a drag on team velocity.

Ingest OTLP metrics directly into Datadog with the new OTLP Metrics API

Many organizations rely on OpenTelemetry (OTel) to standardize observability across distributed systems. These organizations are at varying stages of adoption and are implementing OTel in complex environments with diverse configurations. To support this range of use cases, Datadog offers many ways to use OpenTelemetry with Datadog.

Track, debug, and roll back changes with Version History for Synthetic Monitoring tests

A synthetic test is only useful if you can trust what it’s telling you. When one fails, the reason may not be obvious. Was the application updated? Did the test change? Or both? As more people contribute and refine the same test, it becomes harder to understand what changed or restore a working version. Without clear visibility into those updates, teams can spend more time tracking down the cause of a failure than resolving it.

Monitor logs from Amazon EKS on Fargate with Datadog

Amazon EKS on Fargate is a managed service that reduces the operational overhead of maintaining a Kubernetes cluster by abstracting away the underlying infrastructure. In a serverless Fargate environment, each pod is assigned its own isolated compute resources; there is no direct host-level access.

Optimize Cloud Costs with Datadog Cloud Cost Management

Datadog Cloud Cost Management unifies observability and cost data so engineering and FinOps teams can drive efficiency together. In this demo, see how you can allocate cloud costs across AWS, Azure, Google Cloud, OCI, and SaaS providers with precision; empower engineers by surfacing costs in their daily workflows; automate recommendations to accelerate optimization; and monitor your daily Datadog costs at no additional charge.

Manage and optimize your OCI costs with Datadog Cloud Cost Management

Engineering teams need to deliver reliable, secure, and high-performing applications, all while keeping costs under control. But engineers often lack visibility into cloud cost data, relying on finance-driven reports that they receive only after the billing cycle closes. Without daily cost insights alongside observability data, they don’t know until it’s too late that an infrastructure change caused a significant cost increase.

How we use Datadog to get comprehensive, fine-grained visibility into our email delivery system

Visibility into email performance is indispensable to any organization that counts on its ability to reach people through their inboxes, including Datadog. SREs, FinOps, and many other teams rely on email as a critical channel for communications from our platform, including monitor alerts, usage reports, and service account notifications. At Datadog, we depend on the visibility provided by our integrations for Mailgun, SendGrid, and Amazon SES to optimize our email performance and ensure deliverability.

Keep stakeholders informed with Datadog Status Pages

When incidents occur, clear communication can be just as important as fast remediation. Your internal teams need timely updates to stay aligned, and your users want to know what is happening and when they can expect a fix. Without a reliable way to proactively share updates, support teams can get flooded with tickets and customer trust can erode. Datadog Status Pages, now generally available, makes it easy to keep everyone informed through a public or internal web page during outages.

Scaling Datadog observability: 1,000 integrations and counting

Integrations have always been central to the Datadog platform, enabling customers to collect the data they need directly from the technologies they use every day. By unifying signals from infrastructure and applications to security and SaaS applications, teams gain both high-level visibility and the ability to drill into the details that matter the most. With more than 1,000 integrations now available, the Datadog ecosystem continues to expand alongside the platforms our customers rely on.

Monitor Slurm with Datadog

Slurm (Simple Linux Utility for Resource Management) is an open source workload management system used to schedule jobs and manage resources for high-performance computing (HPC) Linux clusters. It schedules jobs and allocates resources fairly and efficiently, and it scales across large clusters, which native Linux process management tools struggle to do.