The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Empower the SREs - Conclusions from The SRE Report 2023

Let's be honest, nobody loves surveys. OK, well, I sure don't. But surveys satisfy a huge need for insight into complex human-computer, sociotechnical systems. It turns out that we've been measuring the computer part pretty well, but the humans are not as easy to keep track of. When Google SRE first defined toil as a metric we wanted to reduce, we spent far too long trying to quantify it numerically based on tooling and insights from computer systems.

Building an incident management process - incident.fm

In this podcast, our panellists discuss the foundations that any team needs to put in place when designing their incident management process. Starting from the basics of defining what we really mean by an incident, to how to set your severity levels, roles and statuses, Chris and Pete share their tips for building solid foundations to run your incidents.

3 questions to ask in the build vs buy debate for incident response tooling

As a former incident responder and now as a responder advocate for FireHydrant, I’ve seen the “build vs. buy” debate play out many times. In fact, I even supported the tool that former employers used for managing incidents for years before they decided to buy (more on that in a future blog post).

Webinar: Real talk: automation for ITOps

IT operations move fast. If you’re an ITOps leader, you need to be moving just as fast to make sure your team has what it needs. Positioning your team for success isn’t easy: complexity in IT is increasing every year and can reach a point where it exceeds a person’s capacity to keep pace. In the face of massive growth, ITOps teams can face major challenges with productivity, burnout and efficiency.

Report: The true costs of modern IT outages

If you’re in IT, no doubt you’ve heard the age-old statistic that an average minute of downtime costs $5,600. It turns out that information is a bit outdated and does not reflect the real and nuanced costs of a modern IT outage. BigPanda suspected this and wanted to uncover the true numbers behind outage costs so ITOps can have a better understanding of costs, causes and “cures” of an IT outage.

AppExchange Mavericks: PagerDuty Empowers Customer Service Agents to Resolve Cases Quicker

Jonathan Rende, SVP of Products at PagerDuty, and Salesforce MVP Barb Dietz of AppExchange Mavericks discuss how PagerDuty is working to empower customer service agents to resolve customer-impacting issues faster. BONUS: In this video, you will get a front-row seat to PagerDuty’s product demo.

PagerDuty November 2022 Product Launch - Product Highlights Demo

Learn how PagerDuty's latest capabilities can help you solve critical, unplanned work faster in this new product highlights demo. A host of new capabilities helps you improve team productivity, avoid escalations, and optimize digital services. The video highlights these new PagerDuty features and more.