Sponsored Post

Five things your logs will never tell you

A customer escalation hit my queue when I was on the customer smoke jumpers team at an observability vendor. We were the group that parachutes into Fortune 500 accounts that are one bad week from churning, usually after a major customer outage. The customer had filed a billing dispute three weeks earlier, and their on-call engineers were stuck. They had our full stack: logs, metrics, traces, end-to-end instrumentation, every product we sold and some we didn't. They could see that the request came in. They could see that it returned a 500. They could not see the body. The trace had been sampled out. The log line was truncated at 4KB.

Featured Post

From firefighting to forward planning: a practical route to operational innovation

Operational innovation is often treated as a back-office efficiency exercise, but in practice, it is becoming a strategic discipline. As AI moves deeper into day-to-day operations, technical leaders need a clearer way to cut toil, reduce risk and build the capacity to innovate. For many operations teams, it starts with incident management. When responders are trapped in noisy alert streams, manual escalations and fragmented workflows, innovation is pushed aside by the urgent work of keeping services available.

Who's in Charge? The 4 Key Pillars of AI Governance in 2026

You hire an astute, hard-working, fresh graduate to run things for you. You hand them the keys to everything in your company; that includes every system, every endpoint, every file, and every password, all of it. Your only instruction to them? "Go ahead and improve things!" Then, trusting in their competence, you leave them to it. Doesn't that sound like a recipe for disaster? Yet that's precisely what's happening in IT departments across the world.

How network change management could've prevented a costly switch misconfiguration

Unplanned outages often trace back to a simple but overlooked cause: an untracked configuration change. In many organizations, network device configurations are updated manually without approvals, documentation, or rollback plans. This lack of structure can lead to performance issues, downtime, and compliance risks. In this post, we'll see how a core switch misconfiguration exposed the risks of unmanaged changes.

The AI bill arrived. Now what?

There was a time when “Opus” meant a classical composition and “Sonnet” was fourteen lines of Shakespeare you definitely did not read before the test. Now they’re model tiers, and every new release rewrites the economics of your engineering org whether you’re ready or not. Currently, your monthly total hides the crucial information you need to control and justify AI spend.

Governing AI Agents at Runtime: Open Source Zero-Trust with AGT | Ubuntu Summit 26.04

AI agents are moving from demos to production – but who governs what they do at runtime? The Agent Governance Toolkit (AGT) is an open source, MIT-licensed framework from Microsoft that enforces deterministic policy before every tool call, message, and action an agent takes. In this talk, Imran walks through how AGT brings zero-trust identity, policy-as-code, tamper-evident Merkle audit chains, and a Kubernetes sidecar model to any AI agent, regardless of framework.