
Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Why we went passwordless on our new product

Passwords are dying. The cost of creating and maintaining passwords is becoming untenable, as seen in the rise of users logging in with social products and developers outsourcing their pain to Auth0 and the like. We decided to sidestep password-based authentication and went passwordless on our new product. Read on to see how you can go passwordless too.

Using OnPage to Deliver Exceptional Customer Support

The OnPage Customer Support team consists of knowledgeable, friendly technicians who offer 24/7 assistance. Support recognizes the importance of client relationships and always aims to achieve maximum customer satisfaction. The OnPage incident management system is at the center of Support’s quality service delivery. OnPage triggers instant, critical mobile alerts to technicians whenever customer-initiated tickets are created.

What is DevOps?

DevOps is a term for a cluster of concepts that has become a movement: “a cross-disciplinary practice dedicated to the study of building, evolving and operating, rapidly-changing resilient systems at scale” (Jez Humble). Not everyone agrees on the definition of DevOps because of the complex processes attached to the term; however, the benefits to teams are universally agreed upon.

SRE as Organizational Transformation: Lessons from Activist Organizers

In the software industry’s recent past, the biggest disruptive wave was Agile methodologies. While Site Reliability Engineering is still early in its adoption, those of us who experienced the disruptive transformation of Agile see the writing on the wall: SRE will impact everyone. Any kind of major transformation like this requires a change in culture, which is a catch-all term for changing people’s principles and behaviors.

Introducing Incident Timer

We’re excited to announce Incident Timer - a “days without an incident” timer for software teams to keep track of major engineering incidents. As the people behind Spike.sh, we keep discussing how to build a culture of reliability with our customers. We loved the idea of safety/accident timers in factories which kept track of major accidents. It's a simple and elegant way to keep safety on everybody’s minds.

Accelerate your logs investigations with Watchdog Insights

If you’re investigating an incident, every minute means degraded performance or even downtime for customers. The causes of an issue often come from parts of your systems and applications that you would not think to check, and the sooner you can bring these to light, the better.

SRE2AUX: How Flight Controllers were the first SREs

In the beginning, there were flight controllers. These were a strange breed. In the early days of the US Manned Space Program, most American households, regardless of class or race, knew the names of the astronauts: John Glenn, Alan Shepard, Neil Armstrong. The manned space program was a unifying force of national pride. But no one knew the names of the anonymous men and, later, women who got the astronauts to orbit, to the moon, and, most importantly, back to Earth.

6 incident management hacks to implement using ServiceDesk Plus

Ever wondered how enterprises like Zoho, with over 50 SaaS applications and more than 180,000 customers, handle the spectrum of IT incidents they face? Download this free e-book now to get an insider look into the incident response and management processes that Zoho has perfected over the years.

6 incident management hacks to implement using ServiceDesk Plus Cloud

Ever wondered how enterprises like Zoho, with over 50 SaaS applications and more than 180,000 customers, handle the spectrum of IT incidents they face? Download this free e-book now to get an insider look into the incident response and management processes that Zoho has perfected over the years.

What Our Customers Say About the PagerDuty Platform

As noted in this blog a couple of weeks ago, we recently commissioned IDC to interview PagerDuty customers to quantify the business value they gain from our platform. The study found that, on average, the 14 PagerDuty customers interviewed gained annual benefits of $3.48 million, a three-year ROI of 795%, and a payback period of just over two months.