The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

PagerDuty Operations Cloud Product Demo

Check out the PagerDuty Operations Cloud in action. It detects and analyzes event data from across your digital operations, automates infrastructure and workflows, and mobilizes the right team members to minimize the impact of disruptive events on customers, employees, and brand reputation. It will help your teams free up time and reduce operations costs so you can deliver seamless experiences for your customers.

PagerDuty External Status Pages

External Status Pages offer public audiences a unified source of truth about your infrastructure’s health. This feature can be customized to fit your brand’s look and feel, and you can define different views and sets of Business Services to display. Product Manager Jacky Leybman joins the stream to show off how customers can stay informed about ongoing incidents and read status updates, or subscribe to your status page to receive notifications via email.

The "people problem" of incident management

Managing incidents is already tricky enough, and you want to get to mitigation as quickly as possible. But sometimes it feels like organizing everything surrounding an incident is more difficult than solving the actual technical problem, and you end up getting delayed or sidetracked during mitigation efforts. We call that scenario the “people problem” of incident management.

SIGNL4 Onboarding: Routing Alerts to Teams using Distribution Rules

The SIGNL4 Onboarding series walks users through the processes of SIGNL4, from Signup to Alerts to Settings. Today's video focuses on sending alerts to the right users via distribution rules. Learn how to create distribution rules and route alerts to different teams using criteria included in the events. This video is packed with helpful tips to help you get the most out of your account.

Squadcast Named Category Leader in IT Alerting by G2

🚀Squadcast has been recognized by G2 as a Category Leader in the IT Alerting category! Backed by immense customer love, advanced features, and the highest possible scores 💯— Squadcast has made it to the Leader Quadrant! This video offers all the related updates!

Our lessons from the latest AWS us-east-1 outage

In case you missed it, AWS experienced an outage or "elevated error rates" on their AWS Lambda APIs in the us-east-1 region between 18:52 UTC and 20:15 UTC on June 13, 2023. If this sounds familiar, it's because it's almost a replay of what happened on December 7, 2021, although that outage was significantly more severe and took longer to restore.

Synthetic monitoring as Code with Checkly and ilert

This post will introduce Checkly, the synthetic monitoring solution, and their monitoring as code approach. This guest post was written by Hannes Lenke, the CEO and co-founder of Checkly. First, thanks to Birol and the ilert team for the opportunity to introduce Checkly. ilert recently announced discontinuing its uptime monitoring feature and worked with us on an integration to ensure that existing customers could migrate seamlessly. So, what is monitoring as code and Checkly?

Top 5 Use Cases for Custom Fields on Incidents

Chasing down critical information in disparate systems of record while trying to resolve an incident can make an already stressful situation even more taxing. Extra clicks, extra logins, copy/paste, socializing that information with other responders–it all wastes time and introduces more room for human error. Now PagerDuty customers can use Custom Fields on Incidents to enrich their incident data.