
The latest News and Information on API Development, Management, Monitoring, and related technologies.

Stop Debugging Blindly! How Traffic Capture Can Help Your Code

Is AI "slop" or new code pushing tons of bugs into production? You can't test everything forever. Learn how traffic capture is the most efficient way to understand how your code is actually running in the real world. By grabbing data from sidecars, packet captures, or logs, you get the context you need to prevent bugs and improve performance.

Why do you only use Playwright for pre-release testing and not for production monitoring, too?

We've been running Playwright in production for years. Today, we at Checkly are going all in with Playwright Check Suites, our latest step toward uniting testing and monitoring in a single workflow. It's our biggest advancement yet! Here's why this matters: we're no longer adapting Playwright. We're running it natively in production with full `playwright.config` support, complete control over custom dependencies, and support for every tag, spec, or configuration.

Introducing The Next Phase Of Synthetic Monitoring: Playwright Check Suites

We've been running Playwright in production since the beginning. Today, we're going all in. When we first launched Browser Checks with Playwright support, we proved something critical: the most popular test automation framework since Selenium isn't just for testing—it's the foundation of modern production monitoring. But that was just the beginning. Today, we're announcing Playwright Check Suites—our bet on the future of monitoring and the most significant evolution in Checkly's history.

Mitmproxy vs Proxymock: Replaying Traffic for Realistic API Testing

Replaying traffic is a core tool in your toolbox when you need to reproduce a tricky bug or validate how your app behaves. Traffic replay is especially valuable for testing complex software applications that rely on APIs and microservices, where integration and functionality must be thoroughly validated.

Part 1: Building a Production-Grade Traffic Capture and Replay System

A few years ago I was on call during the Super Bowl. At the time I was working for an observability vendor, and one of our customers had an outage caused by a surge in user traffic. But our monitoring system didn’t have enough data to show what went wrong, and I sat on a call for two hours, painfully listening to them spin up more servers to try to catch up with the user load.

Debugging Without a Net: The Pain of Reproducing Production Issues

Every engineer has been there — a late-night page, a broken feature in production, and no clear way to reproduce it. The logs are vague. The metrics look normal. Your local environment works fine. Yet something somewhere is failing for real users. So begins the detective work — debugging a live system with almost no tools, no perfect test data, and no clone of production.

Ingest OTLP metrics directly into Datadog with the new OTLP Metrics API

Many organizations rely on OpenTelemetry (OTel) to standardize observability across distributed systems. These organizations are at varying stages of adoption and are implementing OTel in complex environments with diverse configurations. To support this range of use cases, Datadog offers many ways to use OpenTelemetry with its platform.