
Stopping Kubernetes cloud waste: agentic automation for enterprise fleets

Agentic Kubernetes resource reclamation is the practice of using an autonomous control plane to continuously identify, suspend, and delete idle infrastructure across a multi-cloud Kubernetes fleet. It replaces manual cleanup and reactive autoscaling with intent-based policies that act on business state, eliminating the configuration drift and cloud waste typical of unmanaged fleets.

Building an agentic content production system with Claude Code

This post by an engineer explains how his team uses the .claude folder in Claude Code. The folder is the hidden directory where you store context files, behavioral rules, and automated workflows so Claude understands how to operate in a specific project. He’d set up coding conventions, tool configs, CI integrations. Very engineering-brained. The tool is called Claude Code, so fair enough. I run a web and content team. We write blog posts, tutorials, and technical guides for a living.

The quiet problem underneath modern software delivery: database change at scale

Application delivery has accelerated over the last decade. Modern CI/CD pipelines, automated testing, and cloud infrastructure have already raised the baseline. Now AI-assisted coding tools are compressing timelines further still: developers are writing and shipping code faster than ever.

(AusBiz) JFrog teams up with Nvidia to manage AI agents

AI agents are making real-time decisions inside enterprises right now: pulling code, accessing tools, executing tasks. But most businesses have zero visibility into what those agents are actually using. In this interview on @ausbizTV, Sunny Rao, SVP APAC at JFrog, explains why the governance gap is one of the biggest risks facing enterprises today, and how JFrog and NVIDIA are building the trust layer to fix it.

From Stack Trace to Probable Cause: AI Root Cause Analysis Is Here

You know the drill. An error fires, you get the stack trace, and then you spend the next 45 minutes tracing it backward through four services, two config files, and a deploy that happened three hours ago. You eventually find the root cause, but the path to get there was manual, slow, and entirely dependent on how well you already knew the codebase. We built AI-powered root cause analysis (RCA) for that kind of slog.

AI Factories Will Be Won on Efficiency: Why the Kubex + Rafay Partnership Matters

The early era for AI was defined by experimentation, standing up isolated environments, and finding the first practical use cases. Today, the conversation is different. Enterprises are no longer asking whether AI matters. They are asking how to scale it sustainably, securely, and economically. That shift is giving rise to the AI factory: a repeatable, governed, production-ready environment where data scientists, platform teams, and application teams can build, train, deploy, and operate AI at scale.

Optimizing the OpenTelemetry Python SDK for LLM Workloads

Agentic workloads thrive with precision tooling. Just like developers, they need the rich context, high cardinality, and fast feedback loops that allow them to ask exploratory, open-ended questions of their code. But instrumentation is costly, and since the dawn of software, developers have tried to do the most with the fewest resources.

Your AI Agents Are Only As Good As Your Data | Harness Blog

Every agent demo follows the same arc. The agent calls an API. A deployment triggers. A ticket gets created. The audience is impressed. Then someone asks a real question: "Which regions had the highest order failure rate this quarter, and are any of them linked to vendor SLA breaches?" That question crosses four entity types — orders, fulfillment records, vendors, SLA contracts.