Latest Videos

Datadog Bits AI SRE: Your new teammate for on-call shifts

Dec 2, 2025 By Datadog In Datadog

Bits AI SRE is an always-on SRE agent built to handle complex troubleshooting and late-night alerts. Developed against thousands of real-world incidents and powered by Datadog’s platform, Bits AI SRE analyzes your entire stack, tests hypotheses, and identifies root causes in minutes. Resolve faster, get back to sleep sooner, and give your on-call team the confidence and capacity they need.

View Video

Datadog

Read more about Datadog Bits AI SRE: Your new teammate for on-call shifts

Introducing Bits AI SRE, your AI on-call teammate

Nov 24, 2025 By Datadog In Datadog

Bits AI SRE is your AI on-call teammate, built to autonomously investigate alerts and coordinate incident response. Integrated with Datadog, Slack, GitHub, Confluence, and more, Bits analyzes telemetry, reads documentation, and reviews recent deployments to determine the root cause of alerts—often before you’ve even opened your laptop. In fact, if you're using Datadog On-Call, you can view Bits’s findings right from your phone—so you’re always one step ahead, no matter where you are.

View Video

Datadog

Read more about Introducing Bits AI SRE, your AI on-call teammate

Data Observability: Build confidence in the data life cycle

Nov 21, 2025 By Datadog In Datadog

Datadog Data Observability provides a complete solution with quality checks (e.g., volume, row changes, freshness), custom SQL-based monitors, anomaly detection, column-level lineage across systems like Snowflake and Tableau, full pipeline visibility, and targeted alerts when data issues arise.

View Video

Datadog

Read more about Data Observability: Build confidence in the data life cycle

Explore Cloud Instance Pricing and Performance with Datadog Instance Explorer

Nov 19, 2025 By Datadog In Datadog

Meet Datadog Instance Explorer — a way to explore, compare, and monitor cloud instance pricing and performance across AWS, Azure, and Google Cloud in one place. In this quick overview, you’ll learn how to: Start exploring your instance options today and make smarter, data-driven infrastructure decisions.

View Video

Datadog

Read more about Explore Cloud Instance Pricing and Performance with Datadog Instance Explorer

Datadog GPU Monitoring: Optimize and troubleshoot AI infrastructure

Nov 18, 2025 By Datadog In Datadog

With Datadog GPU Monitoring, engineering and ML teams can monitor GPU fleet health across cloud, on-prem, and GPU-as-a-Service platforms like Coreweave and Lambda Labs. Real-time insights into allocation, utilization, and failure patterns make it easy to spot bottlenecks, eliminate idle GPU spend, and resolve provisioning gaps. By tying usage metrics directly to cost and surfacing hardware and networking issues impacting performance, Datadog helps teams make fast, cost-efficient decisions to keep AI workloads running reliably at scale.

View Video

Datadog

Read more about Datadog GPU Monitoring: Optimize and troubleshoot AI infrastructure

Bringing Observability to Data

Nov 14, 2025 By Datadog In Datadog

While observability practices have evolved in recent years, they have largely focused on application services and infrastructure. Yet it is data what powers our applications, businesses, and AI models. When data issues occur, the consequences can be far reaching, from poor product experiences to billing errors to misinformed AI outcomes. In this session, Jonathan Morin, Group Product Manager at Datadog, shares real-world examples of incidents and explains how data observability can address them, helping teams detect issues earlier, reduce costly downtime, and restore trust in their data.

View Video

Datadog

Read more about Bringing Observability to Data

The Hidden Bottleneck in Latency: GetYourGuide's Database Performance Journey

Nov 14, 2025 By Datadog In Datadog

Fast front-end and back-end code alone won’t guarantee low end-to-end latency as hidden bottlenecks in the database can undermine even the best engineering efforts. In this session, Oleksii Serhiienko, Senior Site Reliability Engineer at GetYourGuide, will share how his team put database performance at the center of their monitoring strategy. He will highlight how they identified and fixed slow queries, uncovered load balancing issues that drove significant cost savings, and built monitoring practices that improved both reliability and investigation workflows.

View Video

Datadog

Read more about The Hidden Bottleneck in Latency: GetYourGuide's Database Performance Journey

Use Grok parsing to extract fields from logs | Datadog Tips & Tricks

Nov 12, 2025 By Datadog In Datadog

When your logs don’t follow a standard format, it can be difficult to extract valuable information, like key-value pairs and nested JSON objects. Grok parsing lets you define flexible patterns that match unstructured log data so you can extract specific fields to query, filter, and visualize. In this video, you’ll learn how to: By refining your Grok parsers, you can make your logs more useful for analytics, dashboards, or alerts, and get even more value from your logs.

View Video

Datadog

Read more about Use Grok parsing to extract fields from logs | Datadog Tips & Tricks

Bits AI SRE, Flex Frozen, and GPU Monitoring | DASH 2025

Nov 6, 2025 By Datadog In Datadog

Get a first look at Datadog’s biggest product reveals from DASH 2025. Meet Bits AI SRE, your 24/7 autonomous AI Site Reliability Engineer, Flex Frozen for up to 7 years of managed log retention, and GPU Monitoring for full visibility into your AI workloads. Experience the future of observability in action.

View Video

Datadog

Read more about Bits AI SRE, Flex Frozen, and GPU Monitoring | DASH 2025

Triaging an Incident with a Critical Data Pipeline at #rivian

Nov 5, 2025 By Datadog In Datadog

Rivian makes electric vehicles to advance its mission to keep the world adventurous forever. As software defined vehicles, Rivian’s R1T and R1S are connected to the cloud from day 1, and telemetry data is at the heart of enabling mobile notifications, remote diagnostics, fleet management, and more. With so many critical pipelines in the cloud, observability is a top priority for the data platform.

View Video

Datadog

Read more about Triaging an Incident with a Critical Data Pipeline at #rivian

Operations | Monitoring | ITSM | DevOps | Cloud

Datadog Bits AI SRE: Your new teammate for on-call shifts

Introducing Bits AI SRE, your AI on-call teammate

Data Observability: Build confidence in the data life cycle

Explore Cloud Instance Pricing and Performance with Datadog Instance Explorer

Datadog GPU Monitoring: Optimize and troubleshoot AI infrastructure

Bringing Observability to Data

The Hidden Bottleneck in Latency: GetYourGuide's Database Performance Journey

Use Grok parsing to extract fields from logs | Datadog Tips & Tricks

Bits AI SRE, Flex Frozen, and GPU Monitoring | DASH 2025

Triaging an Incident with a Critical Data Pipeline at #rivian

Monthly Archive

Follow Us