|
By Addie Beach
When developers and SREs design application tests, they often prioritize user workflows and API availability. Extending that suite with network tests that match your app’s traffic protocols can reveal whether issues originate in the network or application layer. In this post, we’ll explore how you can design effective network tests using the Transmission Control Protocol (TCP), User Datagram Protocol (UDP), or Internet Control Message Protocol (ICMP).
|
By Adam Virani
Organizations often call product managers the CEOs of the product. But PMs know that’s a myth. When a CEO wants a status report, they get one immediately. They don’t need to negotiate for engineering time, reconcile conflicting project priorities, or wait for a data scientist to find a gap in their schedule. For most PMs, simply understanding the state of the product is where growth can stall.
|
By Reva Ranka
Engineering organizations rely heavily on developer feedback to improve internal platforms, tooling, and processes. However, that feedback is often scattered across disconnected systems such as external forms, spreadsheets, chat threads, and documentation tools. Because these systems are separate from operational data, teams struggle to correlate developer sentiment with measurable performance or reliability outcomes.
|
By Eric Metaj
Developers and SREs who rely on Microsoft Azure DevOps often face fragmented workflows when investigating issues or reviewing code quality. Troubleshooting an error can require jumping between observability tools and source code repositories as you manually connect traces, stack frames, and commits. At the same time, security vulnerabilities, misconfigurations, and flaky tests may go undetected until later stages of the software delivery life cycle (SDLC), where they are more costly to fix.
|
By Geoffrey Carlisle
UK organizations are increasingly required to design systems that account for data residency requirements, ensuring that operational data remains within national boundaries. Many teams already run their applications on AWS infrastructure in the UK, but telemetry data can still be processed outside the region, creating gaps in visibility. Datadog’s upcoming UK availability zone solves this by keeping telemetry data in the same region as the workloads that generate it.
|
By David Iparraguirre
As organizations grow, they face increasing difficulty in managing their observability efforts. More teams mean more dashboards, monitors, API keys, pipelines, and custom configurations. Without a centralized view, administrators spend hours chasing down untagged resources, investigating surprise bills, and revoking dormant credentials. Governance becomes a reactive effort to reduce waste and address issues, falling short of its potential to proactively create standards and optimize observability.
|
By Ryan Lucht
Technical teams want to know the newest, most cutting-edge tools they can implement to give themselves a competitive advantage, whether it’s the latest developer framework or modern CI/CD practices that boost velocity. But there’s one tool from all the way back in the 1920s that can improve any organization, no matter its scale: the randomized, controlled trial—or simply put, experiments.
|
By Micah Kim
As organizations continue to heavily invest in AI and build more agentic workflows, their telemetry data volumes can surge quickly, and the associated costs can become unpredictable. To regain control of their data, many AI-forward teams are turning to high-throughput, low-latency pipelines to collect and route data to tools such as OpenTelemetry (OTel) and ClickHouse. But these self-hosted solutions come with drawbacks.
|
By Sarjeel Yusuf
Single Step Instrumentation (SSI) simplifies Datadog Application Performance Monitoring (APM) by automatically discovering and instrumenting services across a host. For many teams, SSI is the ideal starting point because it helps them achieve full visibility with minimal setup. However, as environments grow, teams often want more control over which services get traced. Auxiliary workloads such as batch jobs and cron tasks might not require distributed tracing.
|
By Tom Sobolik
If you’re building LLM-powered applications and agents, you’ve probably asked yourself: “How do I know if my changes actually made things better?” You can tweak prompts, adjust temperature settings, or try different models, but it’s not always easy to validate whether version B’s response is better than version A’s. Most teams fly blind in preproduction and rely on user feedback to see how well their application works in the real world.
|
By Datadog
Join Datadog CPO Yanbing Li and a special guest as they discuss emerging technologies and innovation, how they impact businesses today, and the new opportunities they create for you.
|
By Datadog
You’re told to “go build agents” without clear guidance on what that actually means, how to do it well, or how to know if it is working. You are not a data scientist. You are a software engineer. In this talk, Datadog AI product leader Shri Subramanian breaks down what changes when you move from building applications to building AI agents, and why familiar approaches like traditional testing and linear delivery fall short. We will explore how agent development shifts the focus from code alone to data, prompts, and evaluation, and why functional reliability matters just as much as operational reliability.
|
By Datadog
Delivering great products to your customers requires a mix of evolution and consistency. To really land with users, your product has to be ready to adapt and scale, prioritizing across a mix of customer and business needs. Join experts in reliability, systems engineering, and DevOps as they share real-world examples, true stories of pitfalls, and astounding impact from the experiments they have run. Learn how experienced practitioners handle failure, adapt to scale, and bridge gaps between teams to improve software performance and customer outcomes.
|
By Datadog
When stakeholders push for faster growth (new markets, new features, a newly modernized stack), your engineering model has to change too. At FitnessPassport, the shift from offshore waterfall delivery to an in-house team meant rebuilding not just services, but confidence: legacy systems with weak logging and little visibility made it hard to know whether changes were working and impossible to spot issues before users did. In this talk, Director of Engineering Rob Mitchell will share how FitnessPassport adopted Datadog and used structured logs, metrics, and traces to tighten feedback loops.
|
By Datadog
Platform teams often end up as the bottleneck for “small” operational asks: add a new button, wire up a workflow, expose one more cloud capability—each change requiring engineering time, reviews, and releases. In this technical deep dive, engineers from the Department of Government Services (Victoria) share the architecture and open source CDK library behind their “Infrastructure Control Panel”: a modular operational enablement app that lets non-technical users interact safely with cloud resources through strong access controls.
|
By Datadog
Datadog has always been driven by a broader vision of helping teams understand and operate complex systems. In this session, you’ll hear from Yrieix Garnier, VP of Product, and Hugo Kaczmarek, Senior Director of Product, as they share the latest updates across the Datadog product suite and discuss how that vision continues to shape the platform’s evolution and support the next generation of AI-driven applications.
|
By Datadog
Get an insider’s view of Datadog from the people who built it. On a special episode of This Month in Datadog, co-founders Olivier Pomel and Alexis Lê-Quôc sit down for a rare, in-depth look at the challenge that inspired them to build the Datadog platform, what the company is working on today, AI, and more. This Month in Datadog brings you the latest updates on our newest product features, announcements, resources, and events.
|
By Datadog
Every second counts during an incident. In 60 seconds, see how five new Incident Management releases can help you more easily stay up to date and collaborate. Check out these announcements and more on This Month in Datadog.
|
By Datadog
Modern distributed systems must simultaneously respect where data must live, where it should live for performance, and where it needs to live for resilience. Data sovereignty and residency requirements increasingly affect technical design decisions, not only in regulated industries, but in any global product that must navigate regional expectations, latency constraints, cost structures, and operational realities.
|
By Datadog
See our latest episode of This Month in Datadog for a spotlight on Datadog Data Observability, which enables you to detect data quality and pipeline issues early, as well as remediate those issues with end-to-end lineage. This Month in Datadog brings you the latest updates on our newest product features, announcements, resources, and events.
|
By Datadog
The elasticity and nearly infinite scalability of the cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived VMs or containers. This has elevated the need for new methods and new tools for monitoring. In this eBook, we outline an effective framework for monitoring modern infrastructure and applications, however large or dynamic they may be.
|
By Datadog
As Docker adoption continues to rise, many organizations have turned to orchestration platforms like ECS and Kubernetes to manage large numbers of ephemeral containers. Thousands of companies use Datadog to monitor millions of containers, which enables us to identify trends in real-world orchestration usage. We're excited to share 8 key findings of our research.
|
By Datadog
Where does Docker adoption currently stand and how has it changed? With thousands of companies using Datadog to track their infrastructure, we can see software trends emerging in real time. We're excited to share what we can see about true Docker adoption.
|
By Datadog
Build an effective framework for monitoring AWS infrastructure and applications, however large or dynamic they may be. The elasticity and nearly infinite scalability of the AWS cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived components. This has elevated the need for new methods and new tools for monitoring.
|
By Datadog
Like a car, Elasticsearch was designed to allow you to get up and running quickly, without having to understand all of its inner workings. However, it's only a matter of time before you run into engine trouble here or there. This guide explains how to address five common Elasticsearch challenges.
|
By Datadog
Monitoring Kubernetes requires you to rethink your monitoring strategies, especially if you are used to monitoring traditional hosts such as VMs or physical machines. This guide prepares you to effectively approach Kubernetes monitoring in light of its significant operational differences.
Datadog is the essential monitoring platform for cloud applications. We bring together data from servers, containers, databases, and third-party services to make your stack entirely observable. These capabilities help DevOps teams avoid downtime, resolve performance issues, and ensure customers are getting the best user experience.
See it all in one place:
- See across systems, apps, and services: With turnkey integrations, Datadog seamlessly aggregates metrics and events across the full DevOps stack.
- Get full visibility into modern applications: Monitor, troubleshoot, and optimize application performance.
- Analyze and explore log data in context: Quickly search, filter, and analyze your logs for troubleshooting and open-ended exploration of your data.
- Build real-time interactive dashboards: More than summary dashboards, Datadog offers all high-resolution metrics and events for manipulation and graphing.
- Get alerted on critical issues: Datadog notifies you of performance problems, whether they affect a single host or a massive cluster.
Modern monitoring & analytics. See inside any stack, any app, at any scale, anywhere.