Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Breaking through the Senior Engineer ceiling

You’ve made it to Senior engineer. Now what? You’re now staring at the next level, Staff typically, sometimes Principal, or whatever your company calls it. The path feels murky. Your manager gives you feedback like “show more technical leadership” or “think bigger picture”, but what does that actually mean day-to-day? I’ve been there. I’ve also been on the other side, helping engineers grow through whatever explicit (or implicit) levels a company has.

Vibe coding with the incident.io API

Many, many years ago, I was a computer science major at the University of Illinois, hoping someday I’d be able to write code for a living. I started my career in QA hoping to learn the ins and outs of software development. But it turns out I wasn’t very good at coding. I was just good enough to get a role as a sales engineer, where all I had to do was write code that could hold together for 30 minutes in a demo.

Top 5 EdTech outages detected by StatusGator in July 2025

July 2025 saw several significant service disruptions affecting the education technology (EdTech) ecosystem. From online learning platforms to creative tools used by teachers and students, these outages caused widespread frustration. StatusGator monitored and detected these incidents, providing early alerts to help schools and organizations stay informed.

We built an MCP server so Claude can access your incidents

"Show me all critical incidents from the last week." "Create an incident for the payment API being down." "What was the root cause of that database incident last Tuesday?" If you've ever wished you could just ask Claude (or any MCP client) to handle incident management tasks instead of context-switching between chat and your incident management dashboard, you're going to like what we built.

EMEA Rundeck by PagerDuty Meetup - July 2025

Join us for an informal 1-hour virtual event where the open-source Rundeck by PagerDuty community comes together to share automation stories and use cases. Whether you're new to Rundeck or looking to elevate your automation game, this meetup is packed with valuable takeaways for everyone! Host: Martin Van Son, Automation Specialist & Strategic Solution Advisor at PagerDuty New OSS Dashboards & Enterprise ROI Plugin + Creating Rundeck Plugins with Claude Code.

AMER Rundeck by PagerDuty Meetup - July 2025

Join us for an informal 1-hour virtual event where the open-source Rundeck by PagerDuty community comes together to share automation stories and use cases. Whether you're new to Rundeck or looking to elevate your automation game, this meetup is packed with valuable takeaways for everyone! Host: Forrest Evans (Director, Product Management at PagerDuty) Rundeck by PagerDuty: A Swiss Army Knife of Automation.

Incident Commander Role: Responsibilities and Best Practices

When a critical system goes down at 3 AM, the difference between a quick resolution and hours of costly downtime often comes down to one role: the incident commander. This person serves as the central coordinator during IT incidents, making crucial decisions that can save thousands of dollars per minute.

EU AI Act: what changes in August 2025 and how to prepare

‍ On August 2, 2025, a key part of the EU AI Act comes into force. It has serious implications for how you manage incidents related to artificial intelligence. ‍ While the full regulation will not apply until 2026, new obligations for providers of general-purpose AI (GPAI) models begin this summer. If you are building or deploying AI-powered services in Europe, the clock is ticking.

Why Monitoring Heartbeat Events with PagerDuty AIOps is the Future of System Health Tracking

Organizations migrating from Opsgenie and other legacy incident management platforms are discovering that basic connectivity monitoring isn’t enough for modern operations. While Opsgenie Heartbeats and similar traditional heartbeat features offer simple binary status checks of system availability, PagerDuty’s AIOps-powered approach transforms system health monitoring from reactive alerting into intelligent, automated operational intelligence.