Datadog

Accelerating Incident Response With Real-Time Business Data at Wayfair

Like any good e-commerce company, Wayfair collects a significant amount of data to use for business intelligence. Until recently, the majority of this data was crunched off-hours in preparation for business use the next day. We also create a great deal of data about our applications and infrastructure in real time.

Volunteers, Not Conscripts: Fixing Out-Of-Hours On-Call at Intercom

Uptime matters. At Intercom, we believe that keeping our product online and working well at all times is critical to the success of our business. Out-of-hours on-call is inherently disruptive to your life as an engineer. You need to be ready to respond quickly and competently to an alert about something being broken.

Node.js monitoring with Datadog APM and distributed tracing

Node.js is an asynchronous JavaScript runtime that is used to develop highly scalable network applications. To help provide more visibility into these dynamic environments, we’re pleased to announce that Datadog APM now officially supports monitoring Node.js applications, joining our existing support for Java, Ruby, Python, and Go.

Watchdog: Auto-detect performance anomalies without setting alerts

With anomaly detection, outlier detection, forecasting, and composite alerting, Datadog enables you to reliably alert the right people at the right time. But what happens when latency starts to increase, or error rates spike, in areas of your application where you haven’t set alerts? That’s what Watchdog is for.

Introducing APM Trace Search & Analytics with infinite cardinality

Distributed tracing provides a detailed view into application performance. Each trace shows you how an individual request was executed in your app: which user did what, which services were involved, how long it took, and whether the request executed successfully. Capturing that level of detail across hundreds or thousands of services provides a vast trove of information for troubleshooting and performance optimization, but it’s not always easy to find the exact trace events you need.