
Agentic AI at Scale: Building the Kubex Agentic AI Platform

In the modern cloud infrastructure landscape, we don’t have a data problem; we have an actionable interpretation gap. Engineering teams are often drowning in metrics that describe a crisis without providing a clear path to remediation. Traditional FinOps, SRE, and DevOps work has become a reactive loop of dashboard-watching and manual firefighting.

How to Catch AI Code Mistakes Before They Reach Production

AI can write code fast, but it makes mistakes humans often don't. In this session, Ole Lensmar, CTO of Testkube, breaks down the real quality risks of AI-generated code and how engineering teams can build guardrails before those bugs hit production. You'll learn the common mistakes LLMs make, and which ones are unique to AI. Whether you're a developer leaning on AI to ship faster or a QA lead trying to keep up with the pace of AI-generated code, this talk gives you a practical framework for staying ahead of quality issues.

Claude Code is running bash commands on your infrastructure. Here's how to watch it.

I’ve been staring at Claude Code telemetry for the past few weeks, and I keep noticing the same thing: most teams drop it into their environment, say “it’s amazing,” and have absolutely no idea what it’s actually doing at the system level. That’s fine for a personal dev tool. It’s not fine when you’ve rolled it out to 50 engineers.

Architecting MCP for AI Agents: Lessons from Our Redesign | Harness Blog

Key takeaways: The Harness MCP server is an MCP-compatible interface that lets AI agents discover, query, and act on Harness resources across CI/CD, GitOps, Feature Flags, Cloud Cost Management, Security Testing, Resilience Testing, Internal Developer Portal, and more. The first wave of MCP servers followed a natural pattern: take every API endpoint, wrap it in a tool definition, and expose it to the LLM.
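The "wrap every endpoint" pattern the article critiques can be sketched in a few lines. This is a hypothetical illustration, not the Harness implementation: the endpoint paths, tool names, and helper functions below are invented for the example. It contrasts mechanically generated per-endpoint tools with a smaller set of intent-level tools.

```python
# Hypothetical sketch of two MCP tool-catalog designs.
# Names and endpoints are illustrative, not from the Harness MCP server.

API_ENDPOINTS = [
    ("GET", "/pipelines"),
    ("GET", "/pipelines/{id}"),
    ("POST", "/pipelines/{id}/execute"),
    ("GET", "/feature-flags"),
    ("PATCH", "/feature-flags/{id}"),
]

def naive_tools(endpoints):
    """First-wave pattern: one tool per raw API endpoint.
    The LLM sees many low-level tools and must compose them itself."""
    tools = []
    for method, path in endpoints:
        slug = path.strip("/").replace("/", "_").replace("{", "").replace("}", "")
        tools.append({
            "name": f"{method.lower()}_{slug}",
            "description": f"Call {method} {path}",
        })
    return tools

def task_tools():
    """Redesigned pattern: fewer, intent-level tools that hide the
    sequence of underlying API calls behind one described operation."""
    return [
        {"name": "run_pipeline",
         "description": "Find a pipeline by name and trigger an execution."},
        {"name": "toggle_feature_flag",
         "description": "Enable or disable a feature flag in an environment."},
    ]

print(len(naive_tools(API_ENDPOINTS)), "endpoint tools vs",
      len(task_tools()), "task tools")
```

The redesign argument is about the agent's decision surface: five (or five hundred) thin wrappers force the model to plan multi-step API choreography, while a handful of task-shaped tools let it express intent directly.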

Claude Code + Lightrun MCP: Your AI Agent Now Has Live Runtime Vision

Claude Code, Anthropic’s coding agent, now integrates with Lightrun through MCP. AI code assistants have been flying blind: Google’s DORA 2025 report found an almost 10% increase in code instability. Even with up to 1M tokens of context available in Claude, this powerful agent cannot see how the code it writes actually behaves inside a live system, with real traffic, real dependencies, and a load of 10,000 requests per second.

AI Assistant for Calico: Troubleshooting at the Speed of Thought

Despite the wealth of data available, distilling a coherent narrative from a Kubernetes cluster remains a challenge for modern infrastructure teams. Even with powerful visualization tools like the Policy Board, Service Graph, and specialized dashboards, users often find themselves spending significant time piecing together context across different screens.

What Engineers Want from AI in Observability... According to the 2026 Observability Survey Report

The results show strong interest in AI for forecasting, root cause analysis, onboarding, and generating dashboards, alerts, and queries. But when it comes to autonomous action, practitioners are more cautious — and 95% say AI needs to show its work to earn trust.

The Hidden Failure Points in Your AI Strategy

New models, new agents, new capabilities. It seems like every week there’s a new must-have AI function. It’s no surprise that leaders are feeling pressure to move quickly. At a PagerDuty on Tour event, a customer joked that they couldn’t fathom having a five-year AI strategy; it makes way more sense to have a five-minute one. There’s truth in that comment.