Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

The why and how behind running incident response game days

In any high-pressure situation, the key to fast action is preparedness, and that’s true of incidents, too. Documenting your incident response processes and training your team on them is essential to a coordinated and efficient response effort. Training sessions, or game days as they’re sometimes called, are one way to get everyone up to speed.

Introducing PagerDuty AIOps: Harnessing the Power of AI to Transform Modern Operations for the Enterprise

Today, PagerDuty launched a new AIOps solution that leverages the power of AI, provides built-in automation, and builds on the company’s foundational data model to transform modern operations for the enterprise. PagerDuty has long suppressed noise to help distributed development teams focus.

Four Years as a Public Company

Four years ago tomorrow, our team rang the bell to open the NYSE for PagerDuty’s IPO. We spent two weeks traveling to meet hundreds of prospective investors in person, sustained by a diet of Cheetos and green M&Ms, sneaker-clad walks to meetings, and unwinding with bad karaoke. We’ve grown in many ways in our first four years as a public company. We have more than doubled the number of customers on the PagerDuty platform, and nearly tripled the number of users.

How to enrich IT alerts and add context with Data Engineering

I see it daily in my role: IT organizations pay for best-of-breed monitoring tools but struggle to tie the pieces together across these siloed systems. The pain of these silos deepens when incidents arise. Incidents are costly for many reasons, such as wasted company resources, potential revenue loss, lower customer satisfaction, and employee burnout. This is exactly why BigPanda exists: to apply AI to the complex problems IT operations, NOC, SRE, and DevOps teams face daily.

Incident Response Playbook

In today's digital age, IT departments play a crucial role in maintaining the overall functionality and security of an organization. One essential tool for managing service outages and downtime is the incident response playbook. This comprehensive guide provides IT departments with the necessary processes and strategies to resolve incidents in a timely and efficient manner.

Time to Upgrade? Why Traditional Pagers Are No Longer Enough

When it comes to time-sensitive events, instant, reliable communication is key. In the past, people relied on pagers for quick communication because they worked on the go, without access to a landline. But today, the availability of cellphones has made portability a standard feature of communication devices, and communication technology has advanced significantly, raising the question: what use do pagers have today?