Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Designing smarter on-call schedules for faster, calmer incident response

When an incident wakes your team early in the morning, the last thing you want is confusion about who’s responding or how help will arrive. An effective on-call schedule doesn’t just get the right person online. It helps them stay calm, confident, and capable of solving problems quickly. Done right, your on-call setup becomes a powerful lever for reducing Mean Time to Acknowledge (MTTA), Mean Time to Resolve (MTTR), and the overall stress that incidents place on your team.

Why you should embrace more incidents (seriously!)

We’re all looking for ways to improve on our incident response. We investigate various metrics and methodologies—all in the name of making sure our customers see the reliable and performant systems we’ve sought to build. In fact, all these efforts are leading us, as an industry, to finally realize the power of surprising anomalous events in our systems. They give us an opportunity to reexamine our expectations and see how our models of the sociotechnical system differs from reality.

incident.io raises $62M in Series B fundraising

00:00 We're thrilled to share that Incident.io has raised $62 million in our Series B, led by Insight Partners.

00:11 Four years ago, we were three people around a kitchen table. Today, we're a team of 80 with thousands of teams using our platform to solve over 250,000 incidents a year. Whether you're streaming Netflix or buying something on Etsy, chances are our platform helped resolve the incidents behind the scenes.

Opsgenie alternative: How to migrate to Grafana Cloud IRM

In recent years, we’ve seen many organizations migrate from legacy incident response tools to Grafana Cloud IRM — our unified incident response and on-call management application hosted on Grafana Cloud — as they look to improve reliability, reduce costs, and consolidate their tooling. To help guide those efforts, we offer several IRM migration tools that allow you to more seamlessly migrate away from those legacy solutions and start using Grafana Cloud IRM.

Top 5 Incident Response Platforms for 2025

An incident response platform helps organizations manage, track, and resolve IT incidents quickly and efficiently. With the right platform, teams can minimize downtime, reduce the impact of incidents, and improve overall response times. ‍ In this article, we’ll explore the top 5 incident response platforms for 2025, helping you choose the best solution for your needs. ‍

Squadcast Strengthens Its Leadership in IT Alerting and Incident Management in the G2 Spring Report

2025 has already started out to be a remarkable year for Squadcast—with our key wins in the G2 Spring Reports, our acquisition by SolarWinds, and a series of impactful product releases and improvements. Our mission has always been clear: to deliver a unified platform that seamlessly integrates On-Call Management and Incident Response, empowering teams to boost service reliability and productivity—all without the burden of context switching.

Opsgenie Is Sunsetting: What to Look for in an Alternative

Atlassian is retiring Opsgenie, and if you're one of the teams relying on it to manage on-call and incidents, you're facing a tough question: Do you make the forced migration to Jira Service Management or Compass, scramble for a lookalike tool — or use this moment to upgrade your entire approach to incident response? If you’re facing that decision, we get it. Changing tools midstream isn’t ideal (to say the least). But it’s also a rare opportunity to take a meaningful step forward.

Metrics That Matter: Measuring Developer Productivity in the AI Era

In this episode, Ryan McDonald is joined by Mark Quigley, Head of Platform Engineering at Ninety.io, for a conversation that cuts through the noise around developer productivity metrics and AI. Mark dives deep into how teams can measure what matters—without falling into the trap of turning every measure into a target. He shares how tools like Developer NPS, DORA metrics, and balanced scorecards can help teams optimize for both output and well-being—but only when framed with the right intent.

The timeline to fully automated incident response

We speak to engineering teams every day, and everybody knows AI is the future. Some tell us they’re massively accelerated by Claude, or that they’re rebuilding their product, team and ways of working. Cursor and Lovable have announced they’re building the last piece of software. Should we give in to the vibes? Embrace exponentials, and forget that the code even exists? The reality is that things will still go wrong. They always do, at least from time to time.