
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

What Engineers Want from AI in Observability... According to the 2026 Observability Survey Report

The results show strong interest in AI for forecasting, root cause analysis, onboarding, and generating dashboards, alerts, and queries. But when it comes to autonomous action, practitioners are more cautious — and 95% say AI needs to show its work to earn trust.

Bridging the Gaps in Modern Operations: How Real-Time Messaging Improves System Reliability

In modern IT environments, reliability is no longer defined solely by system uptime or infrastructure resilience. It is equally shaped by how effectively systems, teams, and processes communicate under pressure. As architectures become more distributed and operations more complex, the gaps between tools, teams, and data streams have become one of the most persistent challenges in maintaining consistent performance.

Network Monitoring as Code

Dangling DNS, TCP handshake failures, packet loss: your network has blind spots that application-level dashboards miss. In this session, Daniel Paulus (VP Engineering, Checkly) sets up DNS, TCP, and ICMP monitors from scratch and deploys them as code using the Checkly CLI. You'll see how to import checks from the UI into a code project, use coding agents to build monitors, and debug network failures with Rocky AI, traceroutes, and packet captures.
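To make the idea of network monitors defined in code more concrete, here is a minimal sketch using only the Python standard library. It is not the Checkly CLI or its API: the `CheckResult` type, the `check_dns`, `check_tcp`, and `run_all` functions, and the dictionary layout of each monitor are all hypothetical names chosen for illustration. ICMP is left out because it normally needs raw-socket privileges.

```python
import socket
import time
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str
    elapsed_ms: float


def check_dns(hostname: str) -> CheckResult:
    """Resolve a hostname and report the addresses returned."""
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(hostname, None)
        addresses = sorted({info[4][0] for info in infos})
        ok, detail = True, ", ".join(addresses)
    except socket.gaierror as exc:
        ok, detail = False, f"resolution failed: {exc}"
    elapsed_ms = (time.perf_counter() - start) * 1000
    return CheckResult(f"dns:{hostname}", ok, detail, elapsed_ms)


def check_tcp(host: str, port: int, timeout: float = 3.0) -> CheckResult:
    """Attempt a TCP connection and report whether the handshake completed."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok, detail = True, "handshake completed"
    except OSError as exc:  # covers refused connections and timeouts
        ok, detail = False, f"connect failed: {exc}"
    elapsed_ms = (time.perf_counter() - start) * 1000
    return CheckResult(f"tcp:{host}:{port}", ok, detail, elapsed_ms)


def run_all(monitors: list[dict]) -> list[CheckResult]:
    """Run each monitor definition; the dict layout is an assumed format."""
    results = []
    for monitor in monitors:
        if monitor["kind"] == "dns":
            results.append(check_dns(monitor["target"]))
        elif monitor["kind"] == "tcp":
            results.append(check_tcp(monitor["target"], monitor["port"]))
        else:
            raise ValueError(f"unknown monitor kind: {monitor['kind']}")
    return results
```

A list such as `[{"kind": "dns", "target": "example.com"}, {"kind": "tcp", "target": "example.com", "port": 443}]` can then live in version control and be passed to `run_all` on a schedule, which is the general shape of the monitoring-as-code workflow the session describes.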

Product Update - March 2026

IncidentHub's latest product updates focus on improving the public status page, adding integrations with ticketing systems, private status page ingestion, and making the notifications more useful to the end user. Some of these improvements are driven by user feedback. Feedback is what makes the product better, and I am personally grateful to all our customers who have shared their feedback with us.

Flow State in an AI Workplace - Digital Friction 1:1 with Mike Lovewell

Tom welcomes Mike Lovewell to explore how digital friction continues to shape the modern workplace. From early days of low awareness to today’s complex, AI-influenced environments, Mike shares how friction has evolved in scale rather than cause. They discuss the growing importance of flow state, the measurable business impact of small disruptions, and why adoption—not just technology—is the key to success. AI emerges as both a solution and a new source of friction, depending on trust and usability.

Monitor schema health with engine.schema_fields: Structure, Drift, and Volatility

If you’ve worked with an observability pipeline, you’ve probably experienced schema problems: a field disappears, a type shifts from string to number, or a new label quietly appears. The causes are everywhere. Different teams adopt different naming conventions. A dependency upgrade changes the shape of a library’s log output. Over time, these small, reasonable decisions compound into schema sprawl: dashboards break, alerts misfire, and teams scramble to find out what happened.
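The three failure modes named above (a field disappearing, a type shifting, a new field appearing) can be detected by comparing the schema inferred from two batches of events. The following is a minimal sketch under stated assumptions: events are plain dictionaries, and `infer_schema` and `diff_schemas` are hypothetical helper names. It does not show how engine.schema_fields itself works.

```python
from collections import defaultdict


def infer_schema(events):
    """Map each field name to the set of Python type names observed for it."""
    schema = defaultdict(set)
    for event in events:
        for field, value in event.items():
            schema[field].add(type(value).__name__)
    return dict(schema)


def diff_schemas(baseline, current):
    """Report fields that were removed, added, or changed type."""
    removed = sorted(set(baseline) - set(current))
    added = sorted(set(current) - set(baseline))
    type_changed = {
        field: (sorted(baseline[field]), sorted(current[field]))
        for field in sorted(set(baseline) & set(current))
        if baseline[field] != current[field]
    }
    return {"removed": removed, "added": added, "type_changed": type_changed}


# Illustrative data only: yesterday's events versus today's.
baseline_events = [{"status": "200", "path": "/a", "user_id": 1}]
current_events = [{"status": 200, "path": "/a", "region": "eu"}]

drift = diff_schemas(infer_schema(baseline_events), infer_schema(current_events))
```

Here `drift` reports `user_id` as removed, `region` as added, and `status` as changed from `str` to `int`, which is the kind of signal that could feed an alert before a dashboard breaks.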

The World's Best Infrastructure Teams Trust Kentik

Why do network and infrastructure teams at leading enterprises including Canva, Dropbox, Google, ConocoPhillips, and ServiceNow choose Kentik? In their own words, customers describe epic cost savings, dramatic return on investment, and blockbuster efficiency improvements that only Kentik can deliver. Learn why Kentik is the must-see network intelligence solution for any enterprise that depends on reliable connectivity.

Engineers Want AI in Observability - With One Catch: 4th Annual Observability Survey by Grafana Labs

Actually useful AI is welcome in observability. AI for the sake of AI is not. In this overview of Grafana Labs’ 4th annual Observability Survey, Marc Chipouras shares what 1,300+ respondents from 76 countries told us about the current state of observability and what comes next. This year’s survey explores four major themes. The results show strong interest in AI for forecasting, root cause analysis, onboarding, and generating dashboards, alerts, and queries. But when it comes to autonomous action, practitioners are more cautious, and 95% say AI needs to show its work to earn trust.