The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Decoding Severity: A Guide to Differentiating Major vs Critical Incidents

Recognizing the difference between major and critical incidents is essential for IT operations, as downtime can result in significant financial losses for businesses. Gartner highlights that effective incident management can cut downtime by as much as 40%. Major incidents disrupt business operations but are typically confined to specific systems or processes.

Round Robin escalation policies: do's and don'ts

The concept of Round Robin comes from sports. It has nothing to do with anyone called Robin; the name comes from the French word ruban (ribbon). In a Round Robin tournament, all participants face each other by taking turns. When applied to on-call schedules, a Round Robin escalation policy means that responders assigned to a level will take turns responding to alerts. When is this strategy useful, and when isn't it?
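To make the turn-taking idea concrete, here is a minimal Python sketch of how alerts could be spread across the responders on one escalation level. The function name `round_robin_assign`, the responder names, and the alert IDs are all hypothetical and chosen only for illustration; this is not how any particular on-call product implements the policy.

```python
from itertools import cycle


def round_robin_assign(responders, alerts):
    """Pair each alert with the next responder in the rotation.

    Hypothetical helper for illustration only: real on-call tools also
    account for schedules, acknowledgements, and escalation timeouts.
    """
    if not responders:
        raise ValueError("at least one responder is required")
    rotation = cycle(responders)  # loops over responders endlessly, in order
    return [(alert, next(rotation)) for alert in alerts]


# Assumed example data: three responders on one level, four incoming alerts.
assignments = round_robin_assign(
    ["alice", "bob", "carol"],
    ["alert-1", "alert-2", "alert-3", "alert-4"],
)
for alert, responder in assignments:
    print(f"{alert} -> {responder}")
```

With three responders and four alerts, the fourth alert wraps around to the first responder, which is the load-spreading effect the policy is meant to provide.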

Behind the scenes: Launching On-call

March 5th was a big day for incident.io as we released our on-call product to the world. Nine months of listening to our customers, coding, fixing, testing, and polishing came together for our biggest product launch to date. Releasing On-call was a huge milestone and represented the next step in our journey as a company.

Align ServiceOps with incident context to meet ITOps goals

ServiceOps is a technology-enabled approach that unifies IT operations and IT service management (ITSM) teams to improve incident management. In a recent survey of more than 400 global IT leaders by Enterprise Management Associates (EMA), 96% of respondents reported positive results from implementing the approach. Adoption rates are high: 75% have either an active effort or a formal initiative to streamline collaboration between ITSM and ITOps teams.

Part I: #3 Virtual Meetup Rundeck by PagerDuty Asia Pacific OSS Community.

Customer Success Story: Samuel Kanagaraj (SRE Lead @ Telstra). Automate with Rundeck by PagerDuty! Explore the transformative power of automation through real-world success stories and expert insights. Hear firsthand from Samuel Kanagaraj, SRE Lead at Telstra, as he shares how automation has revolutionised their operations.

Part II: #3 Virtual Meetup Rundeck by PagerDuty Asia Pacific OSS Community.

Customer Success Story: Jared Vern & Christopher Gadd (Automation Engineers @ One New Zealand). Automate with Rundeck by PagerDuty! Explore the transformative power of automation through real-world success stories and expert insights. Jared Vern and Christopher Gadd, Automation Engineers at One NZ, discuss their experiences and the impact of automation on their workflows.

Onboarding yourself as an engineer at incident.io

At incident.io we use infrastructure as code for configuring everything we can, and we feel that there’s no reason we should exclude our own product from that. As well as configuring things like Google Cloud Platform, Sentry and Spacelift via our infrastructure repo, we also configure incident.io. On your first day as an engineer here, the first PR that you make is to our infrastructure repo.

Runbooks vs Playbooks: Differences & How to Choose

Are you documenting your incident response process but unsure whether you should be writing a runbook or a playbook? Could these be two names for the same kind of document? Read on to learn about two different and complementary structures: playbooks and runbooks. The two are used in tandem, and because the terms are sometimes used interchangeably, they can be mistaken for one another.

Live Call Routing with Squadcast: Helping Teams Achieve Faster Resolutions

This is a recording of our webinar on how Squadcast's Live Call Routing is revolutionizing incident response for teams. In this informative session, you'll learn: the hidden costs of traditional incident reporting methods; how a dedicated phone line streamlines incident communication; Squadcast's easy-to-use, no-code setup for Live Call Routing; and real-world case studies showing how companies have drastically improved their MTTR.