|
By David Iparraguirre
Providing rich context for monitor alerts is an essential part of any robust, scalable monitoring strategy. Alerts that send teams scrambling for basic background information prolong troubleshooting, hindering effective incident response and heightening the potential for service disruption. Given the increasing complexity of modern, distributed applications, however, breaking down knowledge silos in order to ensure consistent access to critical context for alerts can be a challenge.
|
By Bowen Chen
The rapidly growing interest in AI has raised a corresponding demand for specialized cloud compute that is built to run training and inference workloads in a cost-efficient and performant manner. Google Cloud Tensor Processing Units (TPUs) have become a popular accelerated compute solution for AI/ML workloads.
|
By Chris Laverdiere
Data build tool (dbt) is an open source service that cleans, aggregates, and models raw data into organized, analytics-ready formats within a data warehouse. dbt Cloud, a fully managed platform by dbt Labs, extends dbt’s capabilities with advanced features such as scheduling, testing, and monitoring, accessible directly from your browser.
|
By Datadog
On the January episode of This Month in Datadog, join Jeremy Garcia (VP of Technical Community and Open Source) and Daljeet Sandu (Product Manager) for a bonus video that spotlights Datadog On-Call, which is now generally available. Also featured is a roundup of new features that Datadog recently announced. This Month in Datadog is a monthly update of the company’s latest features, product announcements, and more. Subscribe to our YouTube channel to get notifications about future episodes.
|
By Natasha Goel
Cloud unit economics measures the amount an organization spends on cloud services to achieve a discrete business outcome such as a conversion, sign-up, or checkout. Your cloud spending may increase as your applications get more usage and the complexity of your cloud environment grows.
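As a rough illustration of the metric described above, unit cost can be computed as cloud spend divided by the number of business outcomes in the same period. The following is only a minimal sketch: the `Period` class, the `unit_cost` function, and all figures are hypothetical and do not come from the article.

```python
from dataclasses import dataclass


@dataclass
class Period:
    """Cloud spend and business outcomes for one reporting period (hypothetical shape)."""
    label: str
    cloud_spend_usd: float
    outcomes: int  # e.g. completed checkouts


def unit_cost(period: Period) -> float:
    """Return cloud spend per business outcome for the period."""
    if period.outcomes <= 0:
        raise ValueError("unit cost is undefined without at least one outcome")
    return period.cloud_spend_usd / period.outcomes


# Made-up figures: total spend rises, but spend per checkout falls.
january = Period("January", cloud_spend_usd=12_000.0, outcomes=40_000)
february = Period("February", cloud_spend_usd=15_000.0, outcomes=60_000)

for p in (january, february):
    print(f"{p.label}: ${unit_cost(p):.2f} per checkout")
```

In this made-up case, total spend grows from one month to the next while the cost per checkout drops, which is the kind of distinction a unit-cost view is meant to surface.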
|
By Aaron Weber
In modern application development, changes happen constantly: Deployments are pushed, feature flags are toggled, and Kubernetes events reshape infrastructure, to name just a few. While these practices drive innovation and scalability, they also introduce complexity—especially during incidents. Fragmented tools and workflows across teams and organizations make it difficult to pinpoint the root causes of issues, leading to longer resolution times.
|
By Juliano Costa
Rust’s strong memory safety and efficient code execution make it a top choice for building robust, high-performance systems. But even with its powerful guarantees around memory management and thread safety, Rust applications in production environments can still face challenges such as latency spikes, resource contention, and unexpected bottlenecks. For this reason, monitoring Rust applications is essential to ensure they meet performance expectations and remain reliable under load.
|
By Hugo Pucéat
Even with the best monitoring in place, outages are unavoidable. Complex, modern IT environments rely on multiple third-party services, including critical cloud and API providers, and when any one of those goes down, it can trigger a domino effect of increased error rates and latency spikes across your system. And, because you don’t have as much visibility into external services, it can be difficult to identify that the problem is due to an outside outage or disrupted service.
|
By Brianne Bujnowski
The stress, sudden disruptions, and high stakes of resolving issues while on call are among the most challenging aspects of an engineer’s job. Many organizations, from startups to large enterprises, still struggle with their on-call experience, which leads to longer resolution times and lower employee retention rates. Constant context switching, managing multiple tools, and racing against time to resolve issues can cause frustration, burnout, and inefficiency.
|
By Casey Culligan
Modern applications rely on databases, making database performance and reliability essential. As systems grow in scale and complexity, identifying the impact and addressing the root causes of database performance issues—such as long query durations or missing indexes—becomes increasingly challenging. Datadog Database Monitoring (DBM) Recommendations address these challenges by providing a clear, prioritized view of performance bottlenecks.
|
By Datadog
By leveraging Datadog’s powerful monitoring tools, custom dashboards, and alerting systems, Telkomsel gained deep visibility into its infrastructure, significantly reducing incidents and improving operational efficiency.
|
By Datadog
Addressing issues and fixing incidents faster than ever was important to SeatGeek, a leading ticketing platform that connects millions of users to live events. Watch how they mastered incident response by integrating Datadog Incident Management.
|
By Datadog
On This Month in Datadog, we’re bringing you a bonus episode to spotlight Datadog On-Call, which is now generally available, and covering other updates, including the general availability of Code Analysis and our expanded integration with Pinecone.
|
By Datadog
On This Month in Datadog, we’re bringing you a bonus episode to spotlight Datadog On-Call, which is now generally available. On-Call presents pages alongside relevant observability data so teams can quickly respond to incidents. Check out the link in our bio to watch the new episode.
|
By Datadog
Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit the Datadog website. This month, we put the spotlight on Datadog On-Call.
|
By Datadog
Welcome to the second video of our new series, Frontend Observability & Monitoring! Datadog Synthetic Monitoring is a proactive monitoring solution that enables you to create code-free API, browser, and mobile tests to automatically simulate end-user workflows and requests on your front-end applications. This video will walk you through setting up browser and API testing capabilities so you can keep tabs on your application uptime and ensure a reliable user experience.
|
By Datadog
Icertis is a leading contract lifecycle management (CLM) platform that empowers organizations to manage their contracts effectively from initiation to renewal. By leveraging advanced AI and analytics, Icertis helps businesses ensure compliance, mitigate risks, and drive better decision-making. The integration of Datadog has tripled the speed of incident detection and resolution, achieving a 20-30% reduction in overall MTTR and saving approximately $500,000 USD through optimized infrastructure scaling at Icertis.
|
By Datadog
As companies rapidly adopt Large Language Models (LLMs), understanding their unique challenges becomes crucial. Join us for a special episode of "Datadog On LLMs: From Chatbots to Autonomous Agents," streaming directly from DASH 2024 on Wednesday, June 26th, to discuss this important topic. In this live session, host Jason Hand will be joined by Othmane Abou-Amal from Datadog’s Data Science team and Conor Branagan from the Bits AI team. Together, they will explore the fascinating world of LLMs and their applications at Datadog.
|
By Datadog
Learn how you can rapidly optimize your Kubernetes clusters.
|
By Datadog
On This Month in Datadog, we’re spotlighting Datadog Cloud Cost Management for OpenAI, which enables you to break down costs by project and organization, as well as by individual models and their token consumption.
|
By Datadog
As Docker adoption continues to rise, many organizations have turned to orchestration platforms like ECS and Kubernetes to manage large numbers of ephemeral containers. Thousands of companies use Datadog to monitor millions of containers, which enables us to identify trends in real-world orchestration usage. We're excited to share 8 key findings of our research.
|
By Datadog
The elasticity and nearly infinite scalability of the cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived VMs or containers. This has elevated the need for new methods and new tools for monitoring. In this eBook, we outline an effective framework for monitoring modern infrastructure and applications, however large or dynamic they may be.
|
By Datadog
Where does Docker adoption currently stand and how has it changed? With thousands of companies using Datadog to track their infrastructure, we can see software trends emerging in real time. We're excited to share what we can see about true Docker adoption.
|
By Datadog
Build an effective framework for monitoring AWS infrastructure and applications, however large or dynamic they may be. The elasticity and nearly infinite scalability of the AWS cloud have transformed IT infrastructure. Modern infrastructure is now made up of constantly changing, often short-lived components. This has elevated the need for new methods and new tools for monitoring.
|
By Datadog
Like a car, Elasticsearch was designed to allow you to get up and running quickly, without having to understand all of its inner workings. However, it's only a matter of time before you run into engine trouble here or there. This guide explains how to address five common Elasticsearch challenges.
|
By Datadog
Monitoring Kubernetes requires you to rethink your monitoring strategies, especially if you are used to monitoring traditional hosts such as VMs or physical machines. This guide prepares you to effectively approach Kubernetes monitoring in light of its significant operational differences.
Datadog is the essential monitoring platform for cloud applications. We bring together data from servers, containers, databases, and third-party services to make your stack entirely observable. These capabilities help DevOps teams avoid downtime, resolve performance issues, and ensure customers are getting the best user experience.
See it all in one place:
- See across systems, apps, and services: With turnkey integrations, Datadog seamlessly aggregates metrics and events across the full DevOps stack.
- Get full visibility into modern applications: Monitor, troubleshoot, and optimize application performance.
- Analyze and explore log data in context: Quickly search, filter, and analyze your logs for troubleshooting and open-ended exploration of your data.
- Build real-time interactive dashboards: More than summary dashboards, Datadog offers all high-resolution metrics and events for manipulation and graphing.
- Get alerted on critical issues: Datadog notifies you of performance problems, whether they affect a single host or a massive cluster.
Modern monitoring & analytics. See inside any stack, any app, at any scale, anywhere.