Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on APIs, Mobile, AI, Machine Learning, IoT, Open Source and more!

The Technical Architecture Behind Automated Video Generation Systems

I spent several weeks last year reverse-engineering how automated content pipelines actually work. Not because I wanted to build one necessarily. But because the proliferation of AI-generated video content raised questions I could not answer without understanding the underlying systems. How do these pipelines function? What are their actual capabilities and limitations? Where does technology stand today?

Agentic AI Essentials: Examining the Hype Around Agentic AI

In the first article of our Agentic AI Essentials series, we’ll establish what makes agentic AI distinct. We’ll look at the process of tool calling and examine how agentic systems convert intelligence into action. We’ll also explore the human fears, pressures, and ambitions that fuel the hype around agentic systems. By sorting the signal from the noise, IT decision-makers can take the first step toward making sound decisions around agentic AI adoption.

The 54% Improvement Playbook: How Top Performers Integrate GenAI into ITSM

Don't just read the report—learn how to replicate its most impressive results. In our 2025 State of ITSM Report, a select group of top-performing organizations achieved a staggering 54.3% reduction in resolution time by strategically integrating GenAI. This live session moves beyond the data to share their playbook. We'll provide a step-by-step guide on how to pair GenAI with foundational ITSM practices and demonstrate how to weave these tools into your team's daily workflows to achieve maximum efficiency.

Supercharge your LLM Using Production Data Context

Are your LLM coding agents (like Cursor or Claude Code) hallucinating fixes because they don't know what's actually happening in production? In this video, Matt from Speedscale shows you how to bridge the gap between your local IDE and live production traffic using the Model Context Protocol (MCP). Most observability tools just give you telemetry. Speedscale’s MCP server gives your agent the "inner workings" of actual API calls and payloads, so it can check its assumptions against reality. No more "vibe-coding" and hoping it works; let your agent find the 500 errors and rate limits for you.

How To Calculate Your OpenAI Cost Per API Call (And Why It Matters Now)

OpenAI doesn’t bill per feature, per customer, or per transaction. It bills per token, across multiple models, with usage patterns that can change by the hour. As a result, two API calls that support the same feature can have very different costs. Without a clear way to translate token-level pricing into something product, engineering, and finance teams can reason about, AI spend becomes difficult to forecast and harder to control.

Automate flaky test fixes with the Bits AI Dev Agent and Test Optimization

Flaky tests are a significant source of inefficiency that impacts many engineering teams. Along with failing your build, they interrupt your entire development flow, generate excessive CI/CD noise, and, critically, compromise developer trust in the test suite itself. Datadog Test Optimization enables you to manage test suites at scale by pinpointing the flakiest tests, analyzing their history across hundreds of runs, and automatically surfacing the root cause.

How we built an AI SRE agent that investigates like a team of engineers

We built Bits AI SRE to help engineers investigate and solve production incidents, one of the most difficult aspects of operating distributed systems today. As environments grow more dynamic and complex, resolving issues becomes more challenging. Failures now span more services, involve noisier signals, and encompass larger volumes of telemetry data, making it hard for on-call engineers to find root causes quickly. Today, Bits AI SRE is already helping teams decrease time to resolution by up to 95%.