Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Incident Review - AWS Outages Crash Major Online Services - Including Amazon

The following is an analysis of the Amazon Web Services incident on Dec. 7, 2021. Millions of users were affected by an Amazon Web Services outage that took down major online services such as Amazon, Amazon Prime, Amazon Alexa, Venmo, Disney+, Instacart, Roku, Kindle, and multiple online gaming sites. The outage, which originated in the US-EAST-1 region, is still ongoing at the time of blog publication.

Space Made Simple: How PagerDuty Enabled Loft Orbital to Achieve Incident Response Lift Off

The next great space race is on. Today, there are multiple companies competing to earn their slice of a global space industry set to be worth more than $1 trillion by 2040. However, launching a satellite into space still isn’t an option for most organizations due to the prohibitive costs and complex engineering required.

Why automation is the incident response 'easy button' MSPs & IR firms have been waiting for

The managed security services market is booming. Coming in at $22.8 billion in 2021, it is projected to nearly double in just five years and grow to $43.7 billion by 2026. Moreover, cloud-based managed security services are poised to be the major growth driver for the broader MSP market, which came in at $219.59 billion in 2021 and is expected to reach $557.10 billion by 2028. As we can see, providing robust security services is a key competitive differentiator for the lucrative MSP market.

What's New: Updates to Runbook Automation, Event Intelligence, Partner Integrations, and More!

We’re excited to announce a new set of updates and enhancements to the PagerDuty platform. The product team has been hard at work on updates ranging from Event Intelligence, Runbook Automation, and Applications with Monitoring Tools to PagerDuty and PagerDuty Community Events.

Reimagining Retail Incident Response for the Holidays

The holiday season is here, and global retailers are prepared for the biggest retail event of the year. The decrease in new COVID-19 cases, coupled with a rise in vaccination rates, provides a glimmer of hope for shoppers looking to spend for friends and family. Holiday spending is expected to break previous records this year, growing up to 10.5 percent over 2020.

The Cultural Shift to Modern IT Operations

In the world of always-on services, many organizations have taken the path to modernize their IT operations to provide greater agility, lower cost, and most importantly, to deliver frictionless digital customer experiences. Is your DevOps team deploying more frequently than operations can support? Are you struggling to keep up with the maintenance issues associated with aging software? Modernizing your IT operations can be the key to overcoming these complexities.

Dashboard Fridays: Sample PagerDuty Alerting dashboard

Adam Kinniburgh is back with another Dashboard Fridays episode, this time joined by Ashley Thompson as they showcase this example PagerDuty Alerting dashboard. This dashboard gives an overview of alerting sent to PagerDuty from any source, even external sources like Pingdom.