Datadog

Generate span-based metrics to track historical trends in application performance

Tracing has become essential for monitoring today’s increasingly distributed architectures. But complex production applications produce an extremely high volume of traces, which are prohibitively expensive to store and nearly impossible to sift through in time-sensitive situations. Most traditional tracing solutions address these operational challenges by making sampling decisions before a request even begins its path through your system (i.e., head-based sampling).
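As a rough illustration of head-based sampling, the sketch below decides whether to keep a trace before any work on the request has happened. The function names, the hashing scheme, and the 10% default rate are assumptions made for this example; they are not Datadog's implementation.

```python
import hashlib


def head_sample(trace_id: str, rate: float) -> bool:
    """Decide up front whether to keep a trace (hypothetical helper)."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    # Map the trace ID to a number in [0, 1). Every service that sees the
    # same ID reaches the same decision without any coordination.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


def handle_request(trace_id: str, rate: float = 0.1) -> dict:
    # The decision is made before the request runs, so it cannot take the
    # request's eventual latency or error status into account.
    keep = head_sample(trace_id, rate)
    return {"trace_id": trace_id, "sampled": keep}
```

Because the decision comes first, a slow or failing request is just as likely to be dropped as a healthy one, which is the limitation of this approach.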

Run UDP and WebSocket API tests to monitor latency-critical applications

Datadog Synthetic Monitoring allows you to proactively monitor your applications so that you can detect, troubleshoot, and resolve any availability or performance issues before they impact your end users. With our API test suite, you can send simulated HTTP requests to your API endpoints, check the validity of SSL certificates, verify the performance and correctness of DNS resolutions, test TCP connections, and ping endpoints to detect server connectivity issues.

Monitor Azure Government with Datadog

Azure Government is a dedicated cloud for public sector organizations that want to leverage Azure’s suite of services in their highly regulated environments. As these organizations migrate their applications to Azure Government, they need to ensure that they can maintain visibility into the status and health of their entire infrastructure.

Datadog acquires Ozcode

At Datadog, we believe that having visibility into production is crucial to building better software, especially as modern environments become more and more complex. Bugs that occur in production are often difficult to reproduce locally, which leaves developers guessing about what went wrong. To solve this problem, teams need the same depth of visibility into their production environments as they do into their local environments.

Build a modern data compliance strategy with Datadog's Sensitive Data Scanner

Within distributed applications, data moves across many loosely connected endpoints, microservices, and teams, making it difficult to know when services are storing—or inadvertently leaking—sensitive data. This is especially true for governance, risk management, and compliance (GRC) or other security teams working for enterprises in highly regulated industries, such as healthcare, banking, insurance, and financial services.

Datadog on Building Responsive UX

Datadog product designers and frontend developers have been working together to create a new, better UX for creating dashboards, which is one of the most important parts of using Datadog. A central part of this effort was building a new layout engine. Because this project differed from our usual feature work, the collaboration cycle between our developers and designers had to change so that we could design, build, and test constraints and new ideas in the browser more closely and quickly.

Dynamically control your custom metrics volume with Metrics without Limits

Sending custom metrics to Datadog allows you to monitor important data specific to your business and applications, such as latency, dollars per customer, items bought, or trips taken. And tags are key to being able to slice and dice these custom metrics to quickly find the information you need. But collecting enough custom metrics to have complete visibility can be cost prohibitive. For example, you might run microservices instrumented across thousands of containers.

Dash 2021 Keynote

The Datadog team delivers the annual Dash keynote. At Dash 2021, we announced new products and features that give your team even greater visibility into the health and performance of your code, databases, CI/CD pipelines, and more. Now, you can monitor network devices, get visibility into your services' golden signal metrics without touching a single line of code, and integrate third-party tools into our platform with Datadog Apps. We expanded RUM to include iOS error tracking, Session Replay, and Watchdog Insights. And we introduced Datadog Observability Pipelines, which run on your infrastructure and put you in control of your observability data, from how it’s processed to where it’s sent.

Panel: Improving Monitoring & Reliability with Chaos Engineering - Dash 2021 (Datadog, Gremlin, Pismo)

Monitoring and observability are critical for knowing how your systems are behaving, but how do you create the feedback loops to shift from reactively monitoring for incidents to proactively preventing them? In this roundtable discussion, Mauricio Galdieri, Software Architect at Pismo.io, and Kolton Andrus, CEO and co-founder of Gremlin, join Tay Nishimura, Site Reliability Engineer on the Chaos Engineering team at Datadog, to chat about monitoring, Chaos Engineering, and using them together to build more reliable systems.

Scaling HashiCorp's Cloud Platform - Dash 2021 (HashiCorp)

Identifying bottlenecks during times of high load is critical to building a scalable software platform. Stress testing is one way to simulate high load on a system, and it allows you to proactively capture potential bottlenecks before they impact customers. Once a solution is implemented to address the bottleneck, you need a way to measure success and find a new limit. See how HashiCorp Cloud Platform (HCP) developed a stress testing framework that relies heavily on Datadog’s custom metric capabilities, combined with some out-of-the-box integrations, to give HCP engineers a comprehensive view of their platform, and how they used these insights to scale their concurrent data-plane provisioning by 300%.