Incident Management

The latest news and information on Incident Management, On-Call, Incident Response, and related technologies.

What's New: Updates to On-Call Management, Incident Response, Event Intelligence, Process Automation, and More!

We’re excited to announce a new set of updates and enhancements to PagerDuty’s Digital Operations Platform. Recent updates from the product team range from On-Call Management, Incident Response, and Process Automation to PagerDuty Community & Advocacy Events. New capabilities help users and customers resolve incidents faster, among other improvements.

PagerDuty: Event Intelligence for AIOps - Demo!

Noisy alerts and manual remediation can be things of the past. In this video, learn how your team can leverage Event Intelligence, a powerful AIOps solution from PagerDuty that helps teams harness machine learning to reduce alert noise, create context for faster resolution, and remove toil by automating repetitive tasks.

Six Stages of the Business Continuity Management Lifecycle

Business continuity is a crucial part of any scalable operations plan, but many businesses fail to realize how important it is until their first critical emergency. Only then does business continuity management come to the forefront of planning exercises, and stakeholders are forced to reflect on what went wrong, why it went wrong, and whether they can avoid it happening again or be better prepared if it does. The true business continuity management lifecycle begins long before an incident.

SRE vs. Platform Engineering: The Key Differences, Explained

Site Reliability Engineering (SRE) teams and Platform Engineering teams share similar goals -- like maximizing automation and reducing toil -- and similar methodologies. But they have different priorities, and use somewhat different tools to achieve them. What are SREs, what are platform engineers and how is each role similar and different? This article explains.

Five Considerations for Choosing Self-Managed Automation vs. SaaS Automation

Sometimes heritage is better than new. Some people favor Coca-Cola Classic over New Coke, and heirloom tomatoes over regular tomatoes. Some Luddites might say the same thing about cloud computing. “I won’t put my (app/data) in the cloud! It will be more (secure | reliable | cheaper) if I run it myself in my own data center.”

Flow Designer Overview - xMatters Support

The xMatters low-code workflow builder, Flow Designer, lets you build and execute multi-step processes simply by dragging, dropping, and connecting steps. Each step performs an action in your resolution process. That action could be enriching data from an external tool, creating a ticket in a service desk, or sending an actionable notification. The possibilities are endless.

How to build a strong incident response process

When building an incident response process, it’s easy to get overwhelmed by all the moving parts. Less is more: focus first on building solid foundations that you can develop over time. Here are three things we think form a key part of a strong process. I’d recommend taking these one at a time, introducing incident response throughout your organisation. Just being honest: we’re a startup selling incident management software.

Rundeck + Squadcast Integration: Simplifying Alert Routing

Rundeck is an automation tool that helps make existing automation, scripts, and commands more secure, auditable, and easier to run. It is a software job scheduler and runbook automation system that automates routine processes across development and production environments. It brings together task scheduling, multi-node command execution, and workflow orchestration. It also logs everything that happens in the system. Squadcast is an end-to-end incident response tool.