
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

How Fabrix.ai Agents Ensure Data Privacy & Security

As agentic AI moves into enterprise environments, IT and security leaders face a critical challenge: how to leverage advanced LLMs without exposing sensitive data, intellectual property, or proprietary configurations to the cloud. You cannot build a self-driving, autonomous IT infrastructure if your security team blocks the deployment. That’s exactly why the Fabrix.ai platform features an Enterprise-Grade LLM Integration architecture anchored by our built-in Data Security layer.

Shopify outage on February 15, 2026

On February 15, 2026, Shopify experienced a widespread service disruption that impacted merchants and shoppers around the world. While the provider did not acknowledge the issue until 15:36 UTC, StatusGator’s Early Warning Signals detected unusual activity and alerted customers at 15:00 UTC, just minutes after the first outage reports began coming in. This incident highlights the importance of independent, real-time monitoring.

SendGrid Status Monitoring: How to Track Email Delivery Outages

When SendGrid goes down, your transactional emails stop reaching customers. Password resets fail. Order confirmations vanish. Support tickets never arrive. By the time you notice, customers are already complaining. For DevOps and SRE teams, checking SendGrid status shouldn't be a manual process. It shouldn't wait until customers report it either. For a team sending 10,000 transactional emails per day, a 15-minute outage means roughly 100 emails that never arrived.
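The "roughly 100 emails" figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes sends are spread evenly across the day, which real traffic rarely is; the function name `emails_missed` is illustrative and not part of any SendGrid API.

```python
def emails_missed(emails_per_day: int, outage_minutes: float) -> float:
    """Estimate emails affected by an outage.

    Assumption: sending volume is uniform over a 24-hour day.
    """
    minutes_per_day = 24 * 60
    return emails_per_day * outage_minutes / minutes_per_day


# 10,000 emails per day and a 15-minute outage, as in the teaser above
estimate = emails_missed(10_000, 15)
print(f"{estimate:.0f} emails")  # prints "104 emails", i.e. roughly 100
```

If the outage falls during a peak sending window, the real number would be higher than this uniform-rate estimate.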

AI Agents in IT Operations: From Concept to Practical Value

Artificial intelligence has been a defining theme in IT operations for nearly a decade. Early AIOps initiatives focused on predictive analytics and anomaly detection, promising to reduce operational overhead and improve system reliability. While these capabilities delivered incremental value, they often fell short of transforming how operations actually functioned.

The Definitive AWS Outage Report 2025: Reliability Analytics and Cascade Impact

Amazon Web Services remains one of the most popular cloud providers, with 200+ services in 39 regions across the world. Like all providers, it has its share of outages. In 2025, IncidentHub detected 38 AWS outages, of which the one on October 20th had the most widespread impact, affecting hundreds of SaaS providers simultaneously. Payments were disrupted, students lost access to classrooms, developer tooling degraded, and some IT teams experienced alerting gaps.

From RCA to Autonomous Ops: The Future of AI in Observability | Big Tent S3E7

SREs are famously skeptical of AI — so how do you convince them to trust agents in production? In this episode of Grafana’s Big Tent, Tom Wilkie talks with Spiros Xanthos (Resolve AI), Manoj Acharya (Grafana Labs), and Cyril Tovena (Grafana Assistant team) about agent-first observability. They unpack knowledge graphs, LLM reasoning, autonomous debugging, pricing models, and the “Claude Code moment” for observability. Is autonomous production ops closer than we think?

The rise of agentic AI in production: Can observability systems run themselves?

Sometimes the biggest shifts in technology aren’t about collecting more data — they’re about who (or what) gets to act on it. In this episode of “Grafana’s Big Tent” podcast, host Tom Wilkie, Grafana Labs CTO, is joined by Spiros Xanthos, Founder & CEO of Resolve AI, Manoj Acharya, VP of Engineering for Observability at Grafana Labs, and Cyril Tovena, Principal Engineer on the Grafana Assistant team, to discuss agentic AI in observability.

The Grafana Labs operating system: Introducing our Guiding Principles

Matt Toback is the VP of Culture at Grafana Labs. We published our original company values back in December 2020. We were a young company, growing fast, and fully remote. Our values at the time were aspirational, and painted a picture of the kind of company we wanted to be. Those values did real work and they mattered. You could hear them used in everyday conversations, and they helped get us to where we are today. But growth has a way of revealing gaps.