
Kubernetes Networking at Scale: From Tool Sprawl to a Unified Solution

As Kubernetes platforms scale, one part of the system consistently resists standardization and predictability: networking. While compute and storage have largely matured into predictable, operationally stable subsystems, networking remains a primary source of complexity and operational risk. This complexity is not the result of missing features or immature technology.

Canonical Ubuntu and Ubuntu Pro now available on AWS European Sovereign Cloud

January 15, 2026 – Canonical, the publisher of Ubuntu and provider of open source security, support, and services, announced today that it is a launch partner for the AWS European Sovereign Cloud, a new independent cloud for Europe, with Ubuntu and Ubuntu Pro now available. Canonical’s Ubuntu Pro delivers a securely designed, stable, and enterprise-ready foundation for open source innovation while providing customers with the same security, availability, and performance they expect from AWS.

"You Had One Job": Why Twenty Years of DevOps Has Failed to Do it

Let’s start with a question. What is DevOps all about? I’ll tell you my answer. In retrospect, I think the entire DevOps movement was a mighty, twenty-year battle to achieve one thing: a single feedback loop connecting devs with prod. On those grounds, it failed. Not because software engineers weren’t good at their jobs, or didn’t care enough. It failed because the technology wasn’t good enough.

GitKraken Desktop 11.8: Visibility Where It Matters, Undo When It Doesn't

Some releases break new ground. Others clear the path. GitKraken Desktop 11.8 does both. You know that moment when you’re three commits deep into an interactive rebase and realize you’ve made a terrible mistake? Or when you’re trying to explain what changed on a feature branch, but it means manually selecting 47 commits? Or when you just want to preview a README without opening another app?

Getting started with on-call

Setting up on-call is simpler than it seems. It comes down to a few clear decisions about your team and what your service actually needs. This guide walks you through those decisions. You’ll learn who to add to your rotation, how long shifts should last, when to hand off, and what coverage makes sense for your service. By the end, you’ll know exactly how to set up your first schedule and move from ad-hoc firefighting to organized incident response.

What is Runtime Context? A Practical Definition for the AI Era

TLDR: Runtime Context is live, execution-level access to a running production system. It lets engineers and AI agents ask precise questions of running code and get answers immediately, without redeploying or interrupting users. This is the new baseline for reliability.

Fleet Management and Terraform: Use cases and best practices for managing collectors in Grafana Cloud

Earlier this year we launched Grafana Cloud Fleet Management to address the pain that comes with managing scores of telemetry collectors across departments and environments. We've been excited to see how organizations are using it to manage collectors at scale, but we've also heard from users who aren't sure how Fleet Management fits with their existing infrastructure-as-code tooling. The good news is Fleet Management is designed specifically to complement—not replace—tools like Terraform.

Paginating large datasets in production: Why OFFSET fails and cursors win

The things that separate an MVP from a production-ready app are polish, final touches, and the Pareto ‘last 20%’ of work. Many of the bugs, edge cases, and performance issues will come to the surface after you launch, when the user stampede puts a serious strain on your application. If you’re reading this, you’re probably sitting on the 80% mark, ready to tackle the rest.