
The latest news and information on incident management, on-call, incident response, and related technologies.

Should you care about AIOps? Obviously.

There's a lot of hype in the marketplace about AIOps right now, and there are a lot of people who've got some interesting ideas about what it should be. The most common idea that I hear is that it's essentially a layer of AI magic that sits across everything you've got in your IT tooling today, makes sense of all of that for you, and then decreases the number of incidents you have and reduces your MTTR...

Incident Management Process: 6 Tips to Better Prepare Your IM Process for the Holiday Season

Holiday retail sales are likely to increase between 7% and 9% in 2021, according to Deloitte’s annual holiday retail forecast, with holiday sales totaling $1.28 to $1.3 trillion during the November to January timeframe. Deloitte also forecasts that e-commerce sales will grow by 11-15%, year-over-year, during the 2021-2022 holiday season.

How Patient-Centered Care Improves Patient Outcomes

The patient-centered care (PCC) model enhances the way providers interact with patients during the care delivery process. Clinicians who show compassion and empathy toward patients are more likely to achieve meaningful, positive doctor-patient relationships. Indeed, care teams that prioritize PCC have a proven approach to improving patient satisfaction and increasing patient retention.

How Your ITSM Tool & PagerDuty Make a Dynamic Duo for Real-Time Work

There’s an incident. Your teams need to communicate with the development team that owns the service, but that team is too busy to stop and chat. Meanwhile, you in central IT have business leaders asking for updates, angry internal users calling the help desk, and customer service representatives asking for information. You have hundreds of tickets all pertaining to the incident in your ticketing system.

What SREs Can Learn from Facebook's Largest Outage

Facebook’s October 2021 outage was the type of event that gives SREs nightmares: A series of critical business apps crashed in minutes and remained unavailable for hours, disrupting more than 3.5 billion users around the world and costing about 60 million dollars. As incidents go, this was a pretty big one.

PagerDuty Integration Spotlight: Honeycomb

Honeycomb delivers observability for modern engineering and DevOps teams to observe, debug, and improve production systems efficiently. The PagerDuty + Honeycomb integration uses Honeycomb Triggers to notify on-call responders based on alerts sent from Honeycomb. This integration is maintained and supported by Honeycomb. Liz Fong-Jones from Honeycomb joined us live on Twitch to share more about how Honeycomb and PagerDuty can be used together to help your teams and to do some live investigation into Honeycomb’s own performance data.

FireHydrant expands Reliability Platform with Service Catalog

Today, we are happy to announce the launch of Service Catalog to help you better manage, query, and learn about the services that exist in your infrastructure. At FireHydrant, we envision a world where all software is reliable, and we’re on a mission to help every company that builds or operates software get closer to 100% reliability. Service Catalog helps you get closer to 100% reliability.

4 xMatters Use Cases That May Surprise You

xMatters is part technology, part service reliability, and a little bit of magic. If you’ve spent time on the xMatters website, you’ll likely have seen a number of valuable use cases for the platform: it can alert SREs when there’s a website outage, accelerate product development for DevOps teams, and manage on-call schedules and alerts for support teams.

The Cost of Increasing Incidents: How COVID-19 Affected MTTR, MTTA, and More

Digital transformation accelerated for many companies during the last 18 months. While it may have been on the agenda prior to COVID-19, teams were pushed to extreme speeds to digitize and meet the rising online demand. During this time, organizations learned important lessons that they’ll carry on with them into this new future. Leaders can take these learnings and use them to build better products, healthier and more efficient teams, and a happier customer base.