The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Google Workspace outage on November 12: How StatusGator detected it first

On November 12, 2025, users around the world faced difficulty accessing Google Workspace products including Google Drive, Google Docs, Google Sheets, and Google Slides. While the outage did not impact every user, it was widespread and disruptive. StatusGator detected the incident early using real user data and issued an Early Warning Signal long before Google officially acknowledged the issue.

The Hidden Bottleneck in Latency: GetYourGuide's Database Performance Journey

Fast front-end and back-end code alone won’t guarantee low end-to-end latency, because hidden bottlenecks in the database can undermine even the best engineering efforts. In this session, Oleksii Serhiienko, Senior Site Reliability Engineer at GetYourGuide, will share how his team put database performance at the center of their monitoring strategy. He will highlight how they identified and fixed slow queries, uncovered load balancing issues that drove significant cost savings, and built monitoring practices that improved both reliability and investigation workflows.

From Error to Fix: AI-Powered Debugging with Sentry and GitHub

This session will focus on the agent-based features of Sentry for debugging an issue in a web application. We'll walk through a broken issue and show how tools like Sentry Seer and the GitHub repo integration make it easy to determine the root cause by bringing together all the context from Sentry and the code in GitHub, and how the Sentry MCP makes it easy to pull that context down into GitHub Copilot to fix the issue locally.

LogicMonitor Named to CRN's 2025 Edge Computing 100: Proof That the Edge Finally Has Some Brains

Edge computing has been the buzzword of the decade. Everyone is talking about pushing intelligence closer to the edge, but most of that intelligence still needs a map and a flashlight. This week, CRN named LogicMonitor to its 2025 Edge Computing 100, recognizing companies that are actually doing something useful at the edge instead of just hyping it. We are honored. We are also a little amused.

Elastic named a Leader in the IDC MarketScape: Worldwide Observability Platforms 2025 Vendor Assessment

We're proud to share that Elastic has been named a Leader in the IDC MarketScape: Worldwide Observability Platforms 2025 Vendor Assessment (doc, November 2025). We believe this recognition validates our ongoing mission: to deliver an observability platform that is open, extensible, and AI-driven, powering full-stack observability that unifies operational and business data at scale and allowing SRE teams to detect and resolve problems faster.

Expanding Access, Not Risk: Using the Read-Only Role in Honeycomb Teams

Observability works best when everyone who needs visibility can get it without the risk of unintentional changes. Honeycomb’s role-based access control system helps teams strike that balance with a selection of Owner, Member, and Read-Only member roles. This control gives teams more flexibility in how they share access across their organization, helping you scale visibility safely without sacrificing control.

Beyond Isolated AI: How the Selector MCP Server Connects Agents, Context, and Action

AI in network operations is evolving faster than ever. But while new models and agents are emerging almost daily, they’re often working alone, with each confined to its own context, data, and domain. One model might analyze telemetry, another handles automation scripts, and a third generates summaries or recommendations. Each model might be intelligent on its own, but without a way to share context, they end up thinking in isolation, limiting what they can achieve together.

Bringing Observability to Data

While observability practices have evolved in recent years, they have largely focused on application services and infrastructure. Yet it is data that powers our applications, businesses, and AI models. When data issues occur, the consequences can be far-reaching, from poor product experiences to billing errors to misinformed AI outcomes. In this session, Jonathan Morin, Group Product Manager at Datadog, shares real-world examples of incidents and explains how data observability can address them, helping teams detect issues earlier, reduce costly downtime, and restore trust in their data.