Monthly Archive

This Month in Datadog - July 2025

Jul 31, 2025 By Datadog In Datadog

In July’s episode of This Month in Datadog, we’re doing things differently by spotlighting the people behind the products you rely on. Jeremy is joined by Tristan Ratchford to discuss saving time and effort when you’re on call with Bits AI SRE, and by Kevin Hu to explore gaining visibility into datasets across the entire data lifecycle with Data Observability.

Datadog Disaster Recovery mitigates cloud provider outages

Jul 30, 2025 By Michael Richey In Datadog

A loss in infrastructure and applications observability can leave SRE and DevOps teams without insight into the real-time state of their production systems, causing them to temporarily pause code deployments and limit their ability to troubleshoot issues or respond to critical alerts. In modern cloud environments, where services are distributed and deeply interconnected, this lack of visibility can escalate quickly.

Bring high-performance observability to secure Kubernetes environments with Datadog's new CSI driver

Jul 30, 2025 By Adel Haj Hassan In Datadog

In Kubernetes environments, applications often communicate with the Datadog Agent to send telemetry data such as custom metrics via DogStatsD or traces through Datadog APM. How this communication takes place depends on the communication mode set on the Datadog Cluster Agent's Admission Controller. With the sockets option, communication takes place through local inter-process communication via Unix domain sockets (UDS), whereas the service and default hostip options rely on network communication.
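As a rough illustration of the difference between the socket-based and network-based modes, the sketch below formats a DogStatsD-style datagram (`name:value|type|#tags`) and sends it over either a Unix domain socket or UDP. The function names, the example metric, and the idea of passing a socket path such as `/var/run/datadog/dsd.socket` are assumptions for illustration, not taken from the post; in practice the official client libraries handle this for you.

```python
import socket


def format_dogstatsd(name, value, metric_type="c", tags=None):
    """Build a DogStatsD-style datagram: name:value|type|#tag1,tag2."""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload.encode("utf-8")


def send_metric(datagram, uds_path=None, host="127.0.0.1", port=8125):
    """Send one datagram over UDS if a path is given, otherwise over UDP.

    uds_path is a hypothetical mount point for the Agent's socket;
    host and port are assumed defaults for the network-based modes.
    """
    if uds_path:
        # Local inter-process communication: no network stack involved.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        address = uds_path
    else:
        # Network communication, as with the hostip or service options.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        address = (host, port)
    with sock:
        sock.sendto(datagram, address)
```

For example, `send_metric(format_dogstatsd("checkout.requests", 1, "c", ["env:dev"]))` would send a counter increment over UDP, while adding `uds_path=...` would route the same datagram through the local socket instead.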

This Month in Datadog: Bits AI SRE, Datadog Data Observability, and more

Jul 30, 2025 By Datadog In Datadog

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit the Datadog website. This month, we chat with two guests about Bits AI SRE and Datadog Data Observability.

AI Agents Console: Monitor the behavior and interactions of any AI agent in your stack

Jul 29, 2025 By Datadog In Datadog

With Datadog's AI Agents Console, you can monitor the behavior and interactions of any AI agent that’s part of your enterprise stack, whether that’s a computer use agent like OpenAI’s Operator, an IDE agent like Cursor, a DevOps agent like GitHub Copilot, an enterprise business agent like Agentforce, or one of your internally built agents. You'll have full visibility into every agent's actions, insights into the security and performance of your agents, analytics on user engagement, and measurable business value from every agent, all in a centralized location.

New in APM

Jul 29, 2025 By Datadog In Datadog

Datadog’s Latency Investigator for APM, now in Preview, automatically investigates hypotheses in the background, comparing historical traces and correlating change tracking, DBM, and profiling signals. This helps teams quickly isolate root causes and understand impact without combing through raw telemetry data. You can go from detection to resolution in a single workflow, and generate a pull request to apply a recommended fix, all without leaving Datadog.

Data Observability: Build confidence in the data life cycle

Jul 29, 2025 By Datadog In Datadog

Datadog Data Observability provides a complete solution with quality checks (e.g., volume, row changes, freshness), custom SQL-based monitors, anomaly detection, column-level lineage across systems like Snowflake and Tableau, full pipeline visibility, and targeted alerts when data issues arise.

Why continuous profiling is the fourth pillar of observability

Jul 25, 2025 By Marcus Hirt In Datadog

Developers have long used profilers to diagnose performance bottlenecks and improve the efficiency of their code. But a modern version of profiling, continuous profiling, is quietly redefining what profiling is and what it can do. By running nonstop in production with very low overhead, continuous profilers give teams always-on visibility into how their code behaves in the real world.

Datadog IDP: Ship software quickly and confidently

Jul 24, 2025 By Datadog In Datadog

Datadog Internal Developer Portal (IDP) helps developers quickly track down shared engineering knowledge, execute common production tasks in a self-service manner, and evaluate the production-readiness of new service code.

Datadog Log Management: Analyze complex data sets

Jul 24, 2025 By Datadog In Datadog

Datadog Sheets provides a spreadsheet-style interface for analyzing your telemetry data — you can perform lookups, build pivot tables, and create calculated columns using familiar spreadsheet functionality. This enables teams to join datasets, aggregate results, and explore trends without writing code.

Debug live production issues with the Datadog Cursor extension

Jul 24, 2025 By Datadog In Datadog

The Datadog Cursor Extension uses the Datadog remote MCP Server to give developers access to Datadog tools and observability data directly from within the Cursor IDE. The Cursor Extension enables you to view live variable values that your logpoints capture during execution, and you can use the Cursor Agent to identify the lines of code responsible for the issue at hand. The Datadog Cursor Extension is now available in Preview.

How Datadog Cloud Network Monitoring helps you move to a deny-by-default network egress policy at scale

Jul 23, 2025 By Lee Avital In Datadog

When organizations first begin deploying workloads on Kubernetes, it's common for them to start with a permissive egress traffic policy that allows any workload to reach the internet. This approach can make it easier for teams to stay agile and to get services up and running in fast-moving environments. But as your Kubernetes footprint grows, it's important to minimize public internet access on a per-workload basis to improve your organization's security posture.

Bits AI Dev Agent: Automatically identify issues and generate code fixes

Jul 23, 2025 By Datadog In Datadog

The Bits Dev Agent is an AI-powered coding assistant in Datadog designed to reclaim developer productivity by autonomously monitoring telemetry data, identifying key issues, and generating production-ready pull requests. Developers receive asynchronous, context-rich PRs with clear explanations, allowing them to shift their focus from troubleshooting to reviewing solutions and building better code.

Introducing Bits AI SRE, your AI on-call teammate

Jul 23, 2025 By Datadog In Datadog

Bits AI SRE is your AI on-call teammate, built to autonomously investigate alerts and coordinate incident response. Integrated with Datadog, Slack, GitHub, Confluence, and more, Bits analyzes telemetry, reads documentation, and reviews recent deployments to determine the root cause of alerts—often before you’ve even opened your laptop. In fact, if you're using Datadog On-Call, you can view Bits’s findings right from your phone—so you’re always one step ahead, no matter where you are.

Datadog Incident Response: Unify remediation and communication

Jul 23, 2025 By Datadog In Datadog

With Datadog's new AI voice agent in Incident Response, you can quickly get up to speed on the issue and start taking action directly from your phone. Handoff notifications make it easy to jump straight to the relevant context and quickly communicate with other responders. Finally, our status pages enable you to automatically update users on your remediation progress.

Monitor Lambda-hosted web apps with the Lambda Web Adapter integration

Jul 17, 2025 By Jordan Obey In Datadog

As organizations migrate their legacy web applications from containerized or server-based deployments to serverless environments, they often run into a critical compatibility challenge. Traditional web frameworks like Flask, Express, or Spring Boot are designed to run on persistent HTTP servers, not event-driven, stateless environments like AWS Lambda. The AWS Lambda Web Adapter bridges this gap by allowing teams to run web server-based applications inside Lambda with minimal changes.

Choosing the right OpenTelemetry Collector distribution

Jul 16, 2025 By Juliano Costa In Datadog

The OpenTelemetry (OTel) Collector plays a central role in collecting, processing, and exporting telemetry data. If you’re deploying the Collector in production, chances are you’ve reached for the otelcol-contrib distribution. It’s the easiest, most flexible, and most documented distribution, used in nearly every demo and getting-started guide. But here’s the catch: It’s not actually recommended for production use.

Missing container-layer metadata: Why it happens and what you can do

Jul 16, 2025 By Stephanie Wei In Datadog

Container image layers provide valuable insight into what goes into a container, including which packages were installed, what commands were run, and where vulnerabilities might live. The metadata associated with these image layers is essential for debugging, optimizing image size, and managing security risks. However, key container-layer metadata fields such as digest, size, and created_by are sometimes missing, which can disrupt important tasks.

A look back at DASH 2025

Jul 15, 2025 By Claire Laurence In Datadog

DASH 2025 brought the Datadog community together like never before. During our biggest event yet, thousands of attendees gathered at the North Javits Center in New York City for two and a half days of content, learning, and community, where they deepened their knowledge and connected with peers. Here's a quick look back at some of the highlights from this year's DASH.

Proactively troubleshoot with synthetic testing and distributed tracing

Jul 15, 2025 By Addie Beach In Datadog

As your application grows in complexity, identifying the root cause of issues becomes increasingly difficult. Many monitoring strategies make this even harder by siloing frontend and backend data. To effectively troubleshoot problems that spread across your app, you need visibility not just into each part of your stack, but also into how these parts interact.

Monitor agents built on Amazon Bedrock with Datadog LLM Observability

Jul 15, 2025 By Barry Eom In Datadog

As large language models (LLMs) grow more powerful, organizations are deploying agentic AI applications to tackle complex, multi-step tasks. With Amazon Bedrock Agents, developers can orchestrate these agents to manage tasks such as triggering serverless functions, calling APIs, accessing knowledge bases, and maintaining contextual conversations—all while breaking down complex user requests or tasks into manageable steps.

Beyond Metrics: How We Reimagined Incident Response with RUM

Jul 10, 2025 By Datadog In Datadog

When your monitoring tools and logs tell you everything's fine, but users can't access critical healthcare services, where do you look? Our team discovered that Real User Monitoring (RUM) isn't just for tracking page load times and user journeys – it's a powerful incident response tool that can uncover issues traditional monitoring misses entirely.

Datadog named Leader in 2025 Gartner Magic Quadrant for Observability Platforms

Jul 10, 2025 By Yanbing Li In Datadog

We are thrilled to announce that, for the fifth consecutive year, Datadog has been named a Leader in the 2025 Gartner Magic Quadrant for Observability Platforms. We believe that this recognition reflects our continued focus on helping customers observe, secure, and act on everything that matters across their technology stack.

Here's how to add business data to logs from retail endpoints | Datadog Tips & Tricks

Jul 10, 2025 By Datadog In Datadog

Some sources simply do not generate data-rich logs. Retail endpoints that are older or run on proprietary services, for example, very often produce logs without the kinds of data that are needed to perform useful business analytics. So, what can you do?

Reduce your mean time to repair with the Datadog mobile app

Jul 9, 2025 By Nancy Zhu In Datadog

For on-call engineers responding to alerts, every minute counts. Faster incident response means faster mitigation, reduced downtime, and better customer experience. But even the most finely tuned, meticulously detailed alerts can leave responders scrambling for more information. In order to effectively triage and investigate incidents and set remediation in motion, responders need data to help them contextualize alerts.

Troubleshoot root causes with GitHub commit and ownership data in Error Tracking

Jul 9, 2025 By Tarun Kothandaraman In Datadog

When an error occurs, developers need to act quickly. But too often, they’re left searching through stack traces without enough context to understand what happened, who owns the code, or what change may have introduced the issue. This slows down triage, creates inefficient handoffs, and takes time away from building new features.

Monitor your LiteLLM AI proxy with Datadog

Jul 9, 2025 By Barry Eom In Datadog

As organizations rapidly scale their use of large language models (LLMs), many teams are adopting LiteLLM to simplify access to a diverse set of LLM providers and models. LiteLLM provides a unified interface through both an SDK and proxy to speed up development, centralize control, and optimize LLM-powered workflows. But introducing a proxy layer adds abstraction, making it harder to understand how requests are processed.

Understanding data lineage

Jul 9, 2025 By Aaron Kaplan In Datadog

Data lineage is the evolutionary history of datasets. More concretely, lineage is metadata that captures the flow and transformation of data in data pipelines, also called the data lifecycle.
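One way to picture lineage as metadata is a directed graph whose edges record which dataset fed which, and through what transformation. The sketch below is a minimal, hypothetical model (the class, method, and dataset names are invented for illustration and do not reflect any Datadog API); it shows how upstream history can be recovered by walking edges backward.

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage store: edges point from a source dataset to a target."""

    def __init__(self):
        self._upstream = defaultdict(set)  # target -> set of direct sources
        self._transforms = {}              # (source, target) -> description

    def record(self, source, target, transformation):
        """Record that `target` was produced from `source`."""
        self._upstream[target].add(source)
        self._transforms[(source, target)] = transformation

    def ancestors(self, dataset):
        """Return every dataset that `dataset` depends on, directly or not."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for parent in self._upstream.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


graph = LineageGraph()
graph.record("raw_orders", "clean_orders", "drop rows with null order_id")
graph.record("clean_orders", "daily_revenue", "group by day, sum amount")
upstream = graph.ancestors("daily_revenue")
```

Here `upstream` contains both `clean_orders` and `raw_orders`, which is the kind of question lineage metadata answers when a downstream table looks wrong.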

How we created a single app to automate repetitive tasks with Datadog Workflow Automation, Datastore, and App Builder

Jul 3, 2025 By Barak Shoushan In Datadog

For many organizations, scaling up their systems means incorporating new tools to build out infrastructure, optimize code performance and security, improve communication, and track cost changes. While these changes are necessary to support an increased workload, they often result in a situation where even the most basic tasks involve switching between multiple platforms.

Why GovRAMP-authorized observability matters for state, local, and education IT teams

Jul 1, 2025 By Greg Reeder In Datadog

Building on our FedRAMP Moderate authorization and our “In Process” status for FedRAMP High, Datadog for Government is now "In Process" for GovRAMP High Authorization, giving agencies a unified observability platform that meets the toughest public-sector security bars.
