The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

OnPage Atlassian Jira Service Management Integration

OnPage + Jira: Instantly Alert and Mobilize Your On-Call Teams. Say goodbye to missed high-priority tickets! With the OnPage-Jira integration, critical Jira issues instantly trigger alerts to your on-call teams via the OnPage mobile app, ensuring fast response and accountability. What this integration offers: instant alerts for critical Jira tickets; two-way communication between OnPage and Jira.

Pager fatigue: Making the invisible work visible

As much as you try to prevent it, your product will break sometimes. While you hope it would have the decency to do so while you are awake and already working, sometimes the product is inconsiderate and decides to break outside your office hours. Being woken up by a page at 3 am sucks, and being woken up again two hours later (when you get pinged for a follow-up issue you missed the first time) sucks even more.

Demo Roundups! Identifying System Weaknesses to Improve Resilience

How do you proactively identify weaknesses before they lead to costly incidents? Find out how PagerDuty empowers teams to uncover vulnerabilities, streamline incident response, and enhance operational performance to build more resilient systems. Host: Mandi Walls, DevOps Advocate at PagerDuty. Guests: Alex Nauda, CTO at Nobl9; Rich Lafferty, Principal SRE at PagerDuty.

Operational excellence in the age of AI and Automation

The future of operations is here with PagerDuty's groundbreaking AI and automation innovations. Learn how PagerDuty AI agents, powered by PagerDuty Advance, and new use cases like security incident management and LLMOps can help your organization achieve operational excellence to reduce cost, mitigate the risk of outages, and accelerate innovation.

War rooms? Finger-pointing? We can help you.

Say goodbye to late-night firefighting and endless finger-pointing. Explore how Catchpoint helps eliminate the need for “war rooms” by giving teams the visibility and insight they need to detect, diagnose, and resolve internet performance issues before they impact users. Learn how Internet Performance Monitoring (IPM) empowers IT, SRE, and DevOps teams to pinpoint root causes across the entire internet stack; collaborate effectively across teams and vendors; proactively prevent outages and performance degradation; and replace reactive chaos with data-driven confidence.

Transforming the Incident Lifecycle With AI Agents

We’re in the midst of a fundamental shift in how organizations run operations. 51% of companies have already deployed AI agents. What was once reactive and manual is becoming intelligent, automated, and AI-driven. The organizations that embrace this shift gain more than just operational efficiency; they develop a strategic competitive advantage that directly impacts business outcomes.

xMatters Zaxxon Release

Incident management can sometimes feel like piloting a spaceship through enemy fortresses while trying to hit as many targets as possible without, you know... game over. But even if your response processes don't quite involve pixelated robots and laser beams like in the video game Zaxxon, our latest release is here to make sure your feet stay firmly on the ground, whatever incidents may appear in your stratosphere! Let’s take a look...

How to Combat MSP Alert Fatigue

Managed service providers (MSPs) are responsible for monitoring hundreds or even thousands of devices, meaning that they must have a practical way of identifying incidents, vulnerabilities, and outages. The obvious choice is employing an incident alerting tool that can deliver alerts to the on-call engineers responsible for maintaining system health and performance.