Incident Management

The latest news and information on incident management, on-call, incident response, and related technologies.

How communication can make or break your incidents - incident.fm

In this episode, Pete and Lisa discuss why great communication is essential to the success of any incident management process. From keeping your wider team in the loop to minimise disruption, to using customer communication to strengthen your brand when things go wrong, the team share their experiences and top tips for having a transparent incident communication culture.

PagerTree Broadcasts

PagerTree broadcasts are a great way to send mass messages to multiple teams or users (think of an all-hands-on-deck situation). When using the broadcasts feature, you can send one-way messages and optionally request a response. PagerTree's intelligent on-call alert routing gives teams flexible schedules, escalations, and reliable notifications via email, SMS, voice, chatbots, and smartphone app.

How JPMorgan Chase uses Grafana and AI to monitor SLOs, SLIs, and more

For the team at JPMorgan Chase, the daily stakes of having a stable system are high. “We are in the business of making sure that trades are executed, and systems are stable and up and running for a positive client experience,” said Askari Imam, VP, Asset Wealth Management (Product and Integration Delivery).

A better way: 3 incident response areas prime for automation

By automating some rote parts of incident response, you reduce decision fatigue and help responders get to solving the problem faster with less stress. In this post, we talk about three areas of the incident response process that are prime for automation.

Identify and resolve incidents faster with InsightFinder's offering in the Datadog Marketplace

InsightFinder is a SaaS platform that uses AI-backed predictive analytics to anticipate and prevent production incidents. Using InsightFinder with Datadog, you can quickly identify hidden correlations in your application metrics, logs, and events, and address application issues before they escalate into production outages that affect customers.

Gartner IOCS Blog - Lucid Motors Case Study

Assaf Resnick, CEO and co-founder of BigPanda, sat down with Sanjay Chandra, vice president of information technology at luxury electric automaker Lucid Motors, at Gartner IT IOCS 2022. They discussed Lucid’s unique ITOps journey and how BigPanda helps minimize downtime of critical applications and services. Sanjay is a visionary ITOps leader, responsible for IT, enterprise systems, global infrastructure, operations and security at Lucid Motors.

What is Automated Diagnostics? How to reduce escalations and accelerate resolution with automation

Join PagerDuty’s Jake Cohen (Senior Product Manager) with RedMonk’s Kelly Fitzpatrick for a conversation and demo on automated diagnostics, process automation, and incident response. It’s all about automation helping first responders determine if there is an issue, which domain experts (if any) should be brought in to assist, and resolving the issue as quickly as possible.

PagerDuty and RedMonk Present: What is Automated Diagnostics? Part 1 - Use Case

Join PagerDuty’s Jake Cohen (Senior Product Manager) with RedMonk’s Kelly Fitzpatrick for a conversation and demo on automated diagnostics, process automation, and incident response. It’s all about automation helping first responders determine if there is an issue, which domain experts (if any) should be brought in to assist, and resolving the issue as quickly as possible. Part 1 of this 2-part video focuses on the concept and use case of automated diagnostics.

PagerDuty and RedMonk Present: What is Automated Diagnostics? Part 2 - Demo

Join PagerDuty’s Jake Cohen (Senior Product Manager) with RedMonk’s Kelly Fitzpatrick for a conversation and demo on automated diagnostics, process automation, and incident response. It’s all about automation helping first responders determine if there is an issue, which domain experts (if any) should be brought in to assist, and resolving the issue as quickly as possible. Part 2 of this 2-part video focuses on the demo of automated diagnostics.