
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

How Kotak811 Revolutionized Digital Banking Observability with Coralogix

Kotak811, the digital-first engine of Kotak Mahindra Bank, is a banking platform serving over 23 million users across India. Since its launch in 2017, Kotak811 has transformed into the bank’s primary growth driver, now accounting for 70% of all new customer acquisitions. The platform is widely recognized for offering a paperless, mobile-first experience, providing everything from instant zero-balance accounts to seamless UPI payments and investment tools.

Meet AURA: The Open-Source Agent Harness for Production AI: Autonomous Incident Response Demo

Watch AURA autonomously respond to a production incident in real time—from building its reasoning context and querying PagerDuty and ClickHouse, to triggering a human-in-the-loop approval with the on-call SRE, to removing the stuck pod and validating remediation. Every behavior is defined in a simple config. AURA is Mezmo's AI-powered incident response agent built for platform engineers and SREs managing high-volume telemetry pipelines.

See how Mezmo's AI Assistant instantly pinpoints root causes

This video shows how Mezmo's AI Assistant turns noisy telemetry into clear answers when errors spike. By preprocessing data and surfacing only the most relevant patterns, Mezmo quickly identifies issues like database connection failures or resource shortages and delivers actionable recommendations. Watch how AI-powered root cause analysis helps teams troubleshoot faster and with confidence. Mezmo's AI Assistant is built for platform engineers and SREs who need fast, reliable root cause analysis across high-volume telemetry pipelines — without manually sifting through noise.
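
The idea of preprocessing noisy telemetry so that only the most relevant patterns surface can be illustrated with a minimal sketch. This is not Mezmo's implementation; the masking rules and the function names `to_pattern` and `top_patterns` are assumptions made for illustration. It collapses log lines that differ only in variable values (IP addresses, numbers) into one pattern and counts how often each pattern occurs.

```python
import re
from collections import Counter

# Hypothetical masking rules (an assumption, not a vendor's rule set):
# variable tokens are replaced with placeholders so that lines differing
# only in those values collapse into the same pattern.
_MASKS = [
    (re.compile(r"\b\d+\.\d+\.\d+\.\d+\b"), "<ip>"),  # IPv4-looking tokens
    (re.compile(r"\b\d+\b"), "<num>"),                # remaining integers
]


def to_pattern(line: str) -> str:
    """Return the line with variable tokens masked, applied in rule order."""
    for regex, placeholder in _MASKS:
        line = regex.sub(placeholder, line)
    return line


def top_patterns(lines, n=3):
    """Count masked patterns and return the n most frequent as (pattern, count)."""
    counts = Counter(to_pattern(line) for line in lines)
    return counts.most_common(n)


if __name__ == "__main__":
    sample = [
        "db connection failed after 30 ms host 10.0.0.5",
        "db connection failed after 45 ms host 10.0.0.7",
        "order 1234 created",
    ]
    for pattern, count in top_patterns(sample):
        print(count, pattern)
```

Ranking patterns by frequency is only one possible relevance signal; a real pipeline would likely also weigh severity and recency.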

How Mezmo Uses Active Telemetry for Faster AI Root Cause Analysis

AI-powered root cause analysis only works when the data going into the model is clean, relevant, and structured. In this demo, we show how Mezmo's Active Telemetry approach helps engineers and SREs move from noisy application errors to immediate clarity. Using a restaurant ordering application running in Kubernetes, we trigger a database connection pool exhaustion issue and walk through two ways to investigate it with Mezmo.
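
For readers unfamiliar with the failure mode in this demo, here is a minimal sketch of database connection pool exhaustion. It is not the demo application's code; the `ConnectionPool` class, the `PoolExhausted` exception, and the string placeholders standing in for connections are all hypothetical. The pool holds a fixed number of connections, and a caller that cannot obtain one within the timeout gets an error, which is the kind of event that would show up as an application error spike.

```python
import queue


class PoolExhausted(Exception):
    """Raised when no connection becomes free within the timeout."""


class ConnectionPool:
    """A toy fixed-size pool; strings stand in for real connections."""

    def __init__(self, size):
        self._free = queue.Queue(maxsize=size)
        for i in range(size):
            self._free.put(f"conn-{i}")

    def acquire(self, timeout=0.1):
        # Wait up to `timeout` seconds for a free connection.
        try:
            return self._free.get(timeout=timeout)
        except queue.Empty:
            raise PoolExhausted("no free connections") from None

    def release(self, conn):
        # Return the connection so another caller can use it.
        self._free.put(conn)


if __name__ == "__main__":
    pool = ConnectionPool(size=2)
    held = [pool.acquire(), pool.acquire()]  # both connections checked out
    try:
        pool.acquire(timeout=0.01)           # a third caller times out
    except PoolExhausted as exc:
        print("error:", exc)
    pool.release(held[0])                    # releasing frees capacity again
    print("acquired:", pool.acquire(timeout=0.01))
```

In practice, exhaustion usually comes from connections that are never released (for example, a code path that skips `release` on error) or from a pool sized too small for peak concurrency.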

LiveTail: Real-Time Visibility for Active Telemetry

See how Mezmo LiveTail helps teams move from passive log search to active, real-time investigation. In this demo, you'll watch live telemetry stream across services and environments, identify emerging issues as they happen, and use real-time context to troubleshoot faster before signals are delayed, buried, or lost in the noise. LiveTail is part of Mezmo's Active Telemetry platform — built for platform engineers and SREs who need immediate visibility into what's happening across their stack right now, not after the fact.

Connecting Agents for Real-Time Root Cause Analysis with Checkly's Rocky AI

Rocky, Checkly's AI agent, monitors production sites and provides an analysis for every failing check. Previously, a coding agent couldn't access this analysis, leaving incidents and agents disconnected. Now, you can access all the analyses via the Checkly CLI (or API) and tell your coding agent, "Hey, I got a Checkly alert. Please investigate!" With Rocky's structured analysis delivered inline, the coding agent can start with a strong hypothesis, fix issues, and propose a PR in one session.

From Vibes to Signals: Observing Your AI Coding Workflow

Agentic coding tools like Claude Code and Codex have taken centre stage and inserted themselves into the critical path of software development. This shift has happened fast, and for most teams, the visibility hasn’t caught up. Until now we’ve been evaluating our vibe coding the same way – on vibes. You might say “this feels faster” or “that seems like a better approach”. That’s not going to scale.