Latest Posts

Best practices to prevent alert fatigue

As your environment changes, new trends can quickly make your existing monitoring less accurate. At the same time, building alerts after every new incident can turn a straightforward strategy into a convoluted one. Treating monitoring as either a one-time effort or a purely reactive one can result in alert fatigue. Alert fatigue occurs when monitoring systems generate an excessive number of alerts, or when alerts are irrelevant or unhelpful, which diminishes your ability to see critical issues.

Identify and resolve incidents faster with InsightFinder's offering in the Datadog Marketplace

InsightFinder is a SaaS platform that uses AI-backed predictive analytics to anticipate and prevent production incidents. Using InsightFinder with Datadog, you can quickly identify hidden correlations in your application metrics, logs, and events and address application issues before they devolve into production outages that affect customers.

Best practices for continuous testing with Datadog

In Parts 1 and 2, we looked at how you can build and maintain effective test suites. These steps are a key part of ensuring that application workflows function as expected. But how you run your tests is another important point to consider, so in this post, we’ll walk through best practices for executing your tests across every stage of development. Along the way, we’ll also look at how Datadog supports these practices for the applications that you are already monitoring.

Use HiveMQ and OpenTelemetry to monitor IoT applications in Datadog

Large IoT environments are highly complex and comprise multiple layers of disparate devices that must exchange data across potentially unreliable connections. Having visibility into each layer of your IoT environment is critical for quickly identifying problems with your deployment that could negatively impact user experience.

Configure pipeline alerts with Datadog CI monitors

CI pipelines have become an integral part of the development workflow, helping teams automate the continuous building and testing of new updates to application code. The growing importance of CI pipelines has naturally led to a need for increased visibility into their performance. In 2021, Datadog introduced CI Visibility to deliver granular performance metrics for each individual pipeline, allowing you to monitor build duration and related telemetry across all recent commits.

Highlights from AWS re:Invent 2022

Just like shopping on Black Friday, AWS re:Invent has become a post-Thanksgiving tradition for some of us at Datadog. We were excited to join tens of thousands of fellow AWS users and partners for this annual gathering that features new product announcements, technical sessions, networking, and fun. This year, we saw three themes emerge from the conference announcements and sessions.

Golden signals in seconds with Universal Service Monitoring

Whether you are a site reliability engineer, DevOps engineer, or application developer, you need visibility into the health and performance of every service you run or support. But in complex, dynamic environments, it can be difficult to ensure that all services are accounted for.

Monitor your mobile apps with Embrace's offering in the Datadog Marketplace

Embrace is a mobile application monitoring solution that helps you track and troubleshoot mobile app performance by combining data analytics, real user monitoring, network performance monitoring, and hardware monitoring in a single platform. We’re pleased to partner with Embrace to offer an out-of-the-box Embrace Datadog app and software license in the Datadog Marketplace.

Announcing TISAX-compliant observability for the automotive industry and its suppliers

Many organizations face complex regulatory requirements when it comes to monitoring the health and performance of their service and application infrastructure. As part of our ongoing commitment to providing a comprehensive monitoring solution for all customers, we’re pleased to announce that Datadog has achieved TISAX Assessment Level 2 (AL2) certification.

Improve your EC2 rightsizing recommendations with Datadog and AWS Compute Optimizer

While cloud solutions can give you greater flexibility as you scale your infrastructure, limited visibility into resource utilization makes provisioning the right amount of compute resources challenging. To ensure that every workload is fully supported, many organizations may opt to over-provision, which leads to overspending. Or, in an attempt to maximize cost savings, organizations may under-provision, leaving workloads unsupported and risking serious performance impacts.