Komodor Introduces Extensible, Autonomous Multi-Agent Architecture for AI-Driven Site Reliability Engineering

Mar 18, 2026 By Komodor In Komodor

Out-of-the-box and bring-your-own AI agents that encode operational knowledge boost troubleshooting speed and accuracy across cloud-native infrastructure

TEL AVIV and SAN FRANCISCO, March 18, 2026 - Komodor, the autonomous AI SRE company for cloud-native infrastructure, today announced a new extensibility framework that transforms its Klaudia AI technology into a universal multi-agent platform for troubleshooting and optimizing the performance of complex cloud-native infrastructure and applications.

FinOps in the Age of Kubernetes: When Everyone Owns the Bill

Mar 15, 2026 By Ilan Adler In Komodor

A FinOps analyst walks into a Monday morning meeting with a detailed spreadsheet showing $2.3M in potential Kubernetes cost savings. The recommendations look straightforward: reduce memory limits by 40%, scale down replicas during off-peak hours, consolidate workloads onto fewer nodes. The numbers are compelling, the methodology is sound, and the savings would make a material impact on quarterly cloud spend. The SRE team immediately objects.

AI SRE in Practice: Enabling Non-Experts to Troubleshoot Kubernetes

Mar 4, 2026 By Itiel Shwartz In Komodor

Kubernetes troubleshooting traditionally requires deep platform expertise. Understanding pod lifecycle, decoding error messages, correlating events across resources, and identifying root cause all demand experience that takes years to build. This expertise gap creates a bottleneck where only senior engineers can handle production issues, limiting how quickly teams can resolve incidents.

When AI Writes the Code, Who Pays the Cloud Bill?

Mar 1, 2026 By Ilan Adler In Komodor

This is part two of a series on the implications of AI-generated code becoming mainstream. We recently wrote about how AI-generated code is overwhelming SRE teams with production complexity they can’t manage. It turns out that’s only half the problem. The other half shows up on the cloud bill. A prospect reached out to us last month. They’d been using Cursor and Claude Code for six months, shipping features at unprecedented velocity. Product was thrilled.

[Webinar] Conquering the Complexity of Self-Hosted Apps with Agentic AI SRE

Feb 26, 2026 By Komodor In Komodor

Most enterprise SaaS products, like Komodor’s Autonomous AI SRE Platform, require installing a remote agent on the customer’s infrastructure, which varies significantly from one organization to another in architecture, configurations, permissions, processes, and more. This “unmanaged” model creates major blind spots, making daily operations, observability, debugging, and incident response challenging. When failures occur, limited visibility and bespoke systems make root-cause analysis slow, incomplete, or impossible.

When AI Writes the Code, Who Keeps Production Running?

Feb 23, 2026 By Ilan Adler In Komodor

The production environment has become a minefield of code nobody really understands. Here’s what’s happening: Development teams are using Claude Code, Cursor, and GitHub Copilot to ship features at 10x their previous velocity. Product managers are ecstatic. Business stakeholders are thrilled. And somewhere in a war room at 2:17 AM, an SRE is staring at a stack trace for code that was AI-generated three weeks ago, trying to figure out why the payment service just fell over.

AI SRE in Practice: Accelerating Engineer Onboarding with Contextual Expertise

Feb 22, 2026 By Itiel Shwartz In Komodor

Onboarding new engineers to complex Kubernetes environments is expensive. Junior engineers need to learn cluster architecture, understand organizational conventions, navigate internal documentation, and build relationships with senior team members who can answer questions. The process takes weeks or months, and during that time, senior engineers spend significant time mentoring instead of working on complex problems.

AI SRE in Practice: Diagnosing AWS CNI IP Exhaustion Before Widespread Outage

Feb 16, 2026 By Itiel Shwartz In Komodor

IP address exhaustion in Kubernetes doesn’t announce itself with clear error messages. Pods fail to schedule, services degrade unpredictably, and the symptoms look like a dozen different problems before anyone realizes the cluster has run out of available IP addresses. By the time the root cause becomes clear, multiple services are affected and recovery requires coordination across infrastructure layers.

#053 - The Road to Distributed AI and Kubernetes Infrastructure with Matt Butcher (Fermyon) & Ari...

Feb 13, 2026 By Komodor In Komodor

They share their professional origins, highlighting how Kubernetes transitioned from a complex tool for experts to a foundational technology for global enterprises. Part of the conversation focuses on the history of Helm, explaining its growth from a simple hackathon project into a standard package manager. Another part takes on the future of distributed computing, specifically how Akamai is integrating infrastructure as a service to support modern workloads.

AI SRE in Practice: Tracing Policy Changes to Widespread Pod Failures

Feb 9, 2026 By Itiel Shwartz In Komodor

Policy changes in Kubernetes are supposed to improve security, enforce standards, or optimize resource usage. But when a policy change triggers cascading pod failures across multiple namespaces, the investigation becomes a race to identify what changed before more workloads are affected.
