The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Mastering AI Spend With CloudZero And LiteLLM

The AI landscape today feels a lot like the early days of the cloud: exciting, fast-moving, and completely fragmented. Every week, engineering teams are experimenting with dozens of large language models (LLMs) from providers like OpenAI, Anthropic, Google, Mistral, Meta, and beyond. They’re tweaking prompts, testing model performance, swapping context windows, and even running multiple models in parallel to figure out which one works best for each unique use case.

Reliability at Scale: A Conversation with DevOps Leader Ivan Battimiello

For more than a decade, Ivan Battimiello has been building and scaling distributed engineering systems across Europe and the United States. With experience ranging from game development to full-stack engineering and DevOps leadership, he has led operational transformations for global teams, implemented modern reliability frameworks, and introduced advanced automation practices that dramatically reduced system failures.

Staging Environments Explained: Why Staging Is Essential for Safe, Reliable Software Releases

A staging environment is the final checkpoint before any software update goes live: a production-like space where bugs, performance issues, and integration failures can be caught before they impact real users. In this video, we break down what a staging environment is, why it’s critical, and how it helps ensure smooth, predictable deployments.

Monitor Everything is an Anti-Pattern!

Bullshit and nonsense. But let’s take it from the beginning. The industry’s story goes something like this: [...] Then, in the same breath: [...] You see the contradiction already, right? The same industry that tells you “collect less, simplify, trust the experts” is also the industry where: [...] This isn’t an observability strategy. It’s observability by hindsight. Right. Good. Now we’re having fun.

ShipTalk S4E5 | How to Build Real-World ML for 2D Drawings | Marina Petzel (Senior ML Engineer)

What does it actually take to ship AI into a 40+ year old product used by millions of professionals? In this episode of ShipTalk, Dewan Ahmed (Principal Developer Advocate, Harness) chats with Marina Petzel, Senior ML Engineer and AI Productivity Lead at Autodesk, about building and shipping practical AI, not just flashy demos.

Ubuntu Summit 25.10 | Opening remarks

Canonical's Founder and CEO, Mark Shuttleworth, welcomes the attendees of the Ubuntu Summit 25.10. He highlights the interdependence of the open source ecosystem and the role of Ubuntu as both an aggregator and an innovator. He also discusses key partnerships across silicon, cloud, ISVs, and the Ubuntu community, and introduces a new global grassroots strategy leading into future summits.

9 Monitoring Tools That Deliver AI-Native Anomaly Detection

The observability market has moved beyond manual threshold-setting. Modern platforms use statistical algorithms, machine learning, and causal AI to detect anomalies automatically. Some work immediately after deployment. Others train on your data for better accuracy. Each approach has technical trade-offs worth understanding. This guide compares how nine monitoring solutions handle automated anomaly detection and root cause analysis.

Make Data-Driven Decisions with Warehouse Native Experimentation

As organizations accelerate their AI-driven development, the need for trustworthy and transparent experimentation is greater than ever. Warehouse Native Experimentation keeps analysis where the data already lives, enabling teams to validate features with metrics and reliable SQL logic. The result is faster iteration with less risk, and decisions rooted in the same source of truth the business already trusts.