The latest news and information on DevOps, CI/CD, automation, and related technologies.

How it feels to run an incident with AI SRE

We've been building the broader incident.io platform for several years now, and one thing we've learned is that UX matters more here than almost anywhere else. When an incident fires, there's no room for poorly designed interfaces or fumbling through features you haven't touched in a while. The product has to be ergonomic: easy to pick up, easy to navigate, with the right things at your fingertips at exactly the right moment. We've put a lot of effort into this over the last 5 years.

Why Your PromQL Availability Query Returns Nothing When Services Are Healthy

Your SLI query shows 100% availability as No Data. Here's why PromQL returns empty results instead of zero, and the label-preserving fix. Prathamesh works as an evangelist at Last9, runs SRE stories, where SRE and DevOps folks share their stories, and maintains o11y.wiki, a glossary of all terms related to observability.

Canonical releases Ubuntu 26.04 LTS Resolute Raccoon

Today Canonical announced the release of Ubuntu 26.04 LTS, codenamed “Resolute Raccoon,” available to download and install from ubuntu.com/download. Resolute Raccoon builds on the resilience-focused improvements introduced in interim releases, with TPM-backed full-disk encryption, improved support for application permission prompting, Livepatch updates for Arm-based servers, and Rust-based utilities for enhanced memory safety.

AI for Incident Response: Should You Build or Buy?

SREs and platform teams are overwhelmed by the effort of manually troubleshooting ever-more complex cloud-native environments. This pain is driving a breakneck adoption of AI SRE solutions that promise to automate core reliability practices, from root cause analysis to capacity planning. For teams with strong engineering talent, creating a DIY AI SRE seems like a straightforward challenge.

What to expect from a database monitoring vendor: looking beyond the tool

Part 2: Key insights from a fireside chat with Chris Yates. Read part 1 here. Choosing a database monitoring vendor isn't just about features. Once you’re confident that it’s time to reassess your database monitoring strategy, the natural instinct is to start comparing products. However, it’s vital to know how to assess vendor relationships, support quality, and product innovation before you sign anything.

UK Cyber Essentials is Raising the Bar. Governance is How Teams Keep It There.

The April 2026 update to UK Cyber Essentials marks an important shift. Not because it introduces radically new security concepts, but because it removes tolerance for inconsistency. With the effective date quickly approaching, many UK organizations are focused on meeting the immediate requirements. That matters. But the more durable story is what these changes reveal about how security and compliance are now expected to operate in real-world environments.

Test network paths with TCP, UDP, and ICMP in Datadog

When developers and SREs design application tests, they often prioritize user workflows and API availability. Extending that suite with network tests that match your app’s traffic protocols can reveal whether issues originate in the network or application layer. In this post, we’ll explore how you can design effective network tests using the Transmission Control Protocol (TCP), User Datagram Protocol (UDP), or Internet Control Message Protocol (ICMP).

Introducing Ubuntu 26.04 LTS | Resolute Raccoon

Ubuntu 26.04 LTS, codenamed “Resolute Raccoon,” is now available to download. Resolute Raccoon builds on the resilience-focused improvements introduced in interim releases, with TPM-backed full-disk encryption, improved support for application permission prompting, Livepatch updates for Arm-based servers, and Rust-based utilities for enhanced memory safety. This release also brings native support for industry-leading AI/ML toolkits like NVIDIA CUDA and AMD ROCm, making Ubuntu 26.04 LTS the ideal platform for AI development and production workloads.