
Ameet Talwalkar on Building the AI Research Lab

"We're doing cutting-edge AI, focused on real translational impact: getting our research over the wall and into production." Ameet Talwalkar, Datadog's Chief Scientist, shares what it took to build the AI Research Lab from the ground up — and what makes DAIR different from traditional research teams. At Datadog, research ships. Recent work from the lab includes Toto 2.0, open-weights time series forecasting models ranked on leading benchmarks, and ARFBench, a new benchmark for evaluating AI on real incident data.

Measure the real impact of AI coding tools on software delivery with Datadog AI Impact

Engineering teams have rapidly adopted AI coding tools, but organizations still struggle to understand their impact. Existing dashboards focus on activity, such as daily active users, acceptance rates, or lines of generated code, but these metrics don’t answer a more important question: Are teams actually shipping more, faster, and with fewer issues?

How to measure developer experience (DevEx) in the AI era

As AI coding assistants dramatically inflate PR counts, commit frequency, and lines of code, the limitations of individual output metrics have never been more apparent. A developer can now produce significantly more lines per session, but higher volume doesn’t guarantee that the code is stable, maintainable, or successfully running in production. GitClear analyzed over 200 million lines of code and found that code churn nearly doubled following widespread AI adoption.

Project and manage cloud spend with Datadog budget forecasting

Cloud and SaaS spending continues to grow across teams, services, and providers, changing too quickly for retrospective cost management workflows to keep up. Finance and engineering leaders often rely on last month’s reports or manually maintained spreadsheets, which don’t reflect current usage. As a result, teams lack context on how spend is trending and often discover budget overruns only after they’ve occurred.

How to audit and clean up monitors effectively

Alert fatigue and blind spots develop together. Monitoring stacks that generate noise while missing critical issues may have incomplete coverage or poorly configured alerts. As these stacks grow reactively, without a structured coverage assessment, both problems worsen. Teams often add monitors when something breaks and tune thresholds when alerts become unbearable, but rarely audit their overall setup to see whether it works.

How we made a SQL query optimization agent 59% more accurate using autoresearch and LLM Observability

Without experiment infrastructure to help you test your LLM applications, every research session starts with the same questions: What have we tried previously? What were the numbers? Which prompt version produced that result? Why did we discard that approach? The answers live in scattered notes, terminal history, and half-remembered conversations. Each handoff between sessions loses context. In practice, iteration can slow down as teams get bogged down in testing and analysis.

Explore Datadog metrics with Natural Language Queries

Metric exploration often begins with a simple question, but answering that question can require deep familiarity with metric names, tag structures, and query syntax. Experienced users spend time refining queries through trial and error, and newer users struggle to get started. As a result, teams face delays in troubleshooting and analysis. Valuable observability data, including metrics that are difficult to discover and query, also goes underused.

Diagnose slow PostgreSQL queries faster with explain plan correlation

When a PostgreSQL query runs slowly, engineers often start with EXPLAIN ANALYZE. The output is a tree of plan nodes, each one describing a step the database took to execute the query. A query with several joins and a subquery can produce 20 or more nodes. But the plan gives no visual indication of which node corresponds to which clause in the SQL text. Diagnosing the problem means viewing the plan in one window and the query in another, manually tracing connections between them.
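To make the "tree of plan nodes" concrete, here is a minimal Python sketch that walks a plan in the JSON shape PostgreSQL emits for EXPLAIN (FORMAT JSON), where each node carries a "Node Type" and an optional "Plans" list of children. The sample plan is hand-written for illustration, and the helper names (walk_plan, list_nodes) are hypothetical; this is not how Datadog correlates plans with SQL text.

```python
import json

# Hypothetical, hand-written sample in the shape of EXPLAIN (FORMAT JSON) output.
# A real plan carries many more fields per node (costs, row counts, timings).
SAMPLE_PLAN = json.loads("""
[{"Plan": {"Node Type": "Hash Join", "Plans": [
    {"Node Type": "Seq Scan", "Relation Name": "orders"},
    {"Node Type": "Hash", "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "customers"}]}]}}]
""")


def walk_plan(node, depth=0):
    """Yield (depth, node type) for a plan node and all of its descendants."""
    yield depth, node["Node Type"]
    for child in node.get("Plans", []):
        yield from walk_plan(child, depth + 1)


def list_nodes(explain_json):
    """Flatten the plan tree; the JSON output is a list holding one {"Plan": ...}."""
    return list(walk_plan(explain_json[0]["Plan"]))


nodes = list_nodes(SAMPLE_PLAN)
for depth, node_type in nodes:
    print("  " * depth + node_type)
```

Even this four-node sample shows the difficulty: nothing in a node says which JOIN or WHERE clause in the SQL text produced it, so the mapping has to be inferred from relation names and node types.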

Attribute AI costs across providers with Datadog Cloud Cost Management

AI adoption is accelerating across organizations, and spending often follows a similar pattern: rapid growth, multiple providers, and limited visibility into where costs originate. Each provider exposes billing data differently, with distinct schemas, dimensions, and interfaces. FinOps and engineering teams often spend significant time consolidating fragmented data, only to end up with partial attribution and limited context about who or what generated the AI spending.