Monitor the performance of queues and topics with Azure Service Bus

Azure Service Bus is a fully managed enterprise message broker that enables asynchronous messaging between distributed applications. It is designed to decouple application components, allowing them to communicate reliably, securely, and at scale. With Datadog’s Azure Service Bus integration, you can...

Enrich your existing Datadog telemetry with custom metadata using Reference Tables

As your applications scale and generate more telemetry, it becomes increasingly difficult to sift through the data and analyze it against cost, business functions, and security measures. Logs, events, and other telemetry on their own may not include enough meaningful context or readable details, leading to slower troubleshooting, inefficient business processes, and higher costs.

Remediate Kubernetes incidents faster using private actions in your apps and workflows

The Datadog Action Catalog provides more than 1,400 actions to help you accelerate remediation across your infrastructure directly within Datadog. With actions, you can use Workflow Automation to configure workflows that automatically address issues as they happen and build custom apps in App Builder that empower anyone in your organization to act when incidents occur.

How we structure on-call rotations at Datadog

A well-structured on-call rotation helps you ensure the reliability of your services and meet your customers’ expectations by designating staff to respond to emerging issues. But the pressures of on-call work—such as long shifts, overnight hours, and dynamic situations—can compromise the well-being of your team members. This makes it harder for them to maximize service uptime during their on-call shifts and can limit the velocity of the feature work they do outside of their on-call duty.

How to create an effective paging strategy

Empowered engineers and effective tools are the foundation of incident management, and having a solid on-call process can help facilitate both. In practice, however, many paging approaches have the opposite effect, often overwhelming responders and increasing burnout. To create an effective paging strategy, organizations should focus responder attention on the most important issues and help facilitate a sense of ownership over them.

How state, local, and education organizations can manage logs flexibly and efficiently using Datadog Observability Pipelines

State, local, and education (SLED) organizations need their logs to provide clear, structured insights into system performance, user behavior, and security risks. But often, the picture becomes scattered and chaotic instead, with critical log data buried in noise and gaps that make logs difficult to interpret.

Modernizing Government IT: Observability, Security & Cost Optimization with Datadog

Government IT leaders face the monumental challenge of modernizing aging systems, migrating to the cloud, and enhancing citizen services—all while ensuring security, compliance, and cost efficiency. Siloed tools and limited visibility create roadblocks to achieving these goals. Datadog’s FedRAMP-authorized platform provides full-stack observability, AI-powered security, and cloud cost optimization, helping agencies simplify complexity, strengthen Zero Trust security, and maximize IT budgets.

Best practices for managing Datadog organizations at scale

The adoption of Datadog in large enterprises typically goes beyond integrating metrics, traces, and logs to unify observability. These enterprises must implement and use Datadog in a compliant and standard way across divisions, teams, and projects to enhance data security, comply with regulations, manage costs, and increase operational efficiency.

Datadog On Datadog

At Datadog, over 2,000 engineers deploy and ship new features daily. As a leading observability and security platform used by thousands of companies, ensuring quality and reliability is no small feat. Part of our commitment to excellence lies in our dogfooding culture, in which our engineering organization is one of the largest and most demanding users of the Datadog platform.

Incident Response: Keeping Cool When Everything's on Fire

The DevOps revolution broke down the traditional silos between development and operations, fundamentally reshaping how we build and maintain software. But with this evolution came an inevitable reality for many engineers: being on-call and responding to incidents. While critical for service reliability, the on-call experience often brings significant stress.

Monitor GitHub Copilot with Datadog

AI-powered coding tools are becoming more commonplace within developer workflows. GitHub Copilot is a popular AI coding assistant that can be integrated directly into IDEs or used as a standalone chat interface. This tool helps you write code faster and with less effort by auto-completing code in real time, generating blocks of code from natural language prompts, and answering your questions to help you get over coding hurdles and roadblocks.

This Month in Datadog: Conversations with two Datadog leaders, a sneak peek of DASH 2025, and more

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. This month, we’re joined by Datadog CPO Yanbing Li and SVP of Engineering David Mitchell.

Java on containers: a guide to efficient deployment

Java remains one of the most widely used programming languages today, especially in enterprise backend systems—and for many good reasons. With each new release, Java’s robust runtime offers additional improvements in performance, security, scalability, and developer productivity. The portability of its code has proven increasingly relevant and useful as the industry embraces ARM64, making Java one of the go-to languages for modern workloads.

Monitoring single-page app interactivity with Core Web Vitals and Datadog

Web applications generate a wealth of performance data, but it’s challenging to know exactly which metrics are the most useful for monitoring your user experience. Focusing on irrelevant metrics wastes time and resources, but if you pare down the data you’re observing too far, you may miss critical insights.