
Incident Management

The latest news and information on incident management, on-call, incident response, and related technologies.

Uptime During the Holiday Shopping Season

In the United States, it’s almost that time of year again when we count our blessings and give thanks. For retail workers, it’s also the time of year when they prepare for the onslaught of eager shoppers who have waited hours in line to run into stores and get their hands on doorbuster deals (sometimes knocking down employees in the process).

Meet Opsgenie at AWS re:Invent 2018: Making Incident Response Faster and More Efficient

It’s an exciting time here at Opsgenie. We recently joined the Atlassian family, updated our logo, released new pricing, and now we’re headed to AWS re:Invent 2018! So much has changed since last year’s event and we can’t wait to talk about it in person.

What is MTTR? Critical Incident Recovery Metrics to Reduce Downtime

Whether it’s scheduled maintenance or an unexpected outage, downtime is time your solutions are out of action and unavailable for use. Long or frequent periods of downtime carry significant costs for the company and ultimately undermine customer trust. So what is MTTR? And how can improving MTTR reduce downtime? Below are four key metrics to get you started.
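As a rough illustration of the arithmetic behind such a metric, here is a minimal sketch that assumes MTTR means "mean time to recovery" (the acronym is also read as mean time to repair, resolve, or respond, and the teaser does not say which). The function name and the sample incidents are hypothetical, not taken from the article.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average downtime per incident.

    `incidents` is a list of (started, resolved) datetime pairs.
    Assumption: MTTR = total downtime / number of incidents.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the individual outage durations, starting from a zero timedelta.
    total = sum((resolved - started for started, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical sample data: one 30-minute outage and one 90-minute outage.
incidents = [
    (datetime(2018, 11, 1, 9, 0), datetime(2018, 11, 1, 9, 30)),
    (datetime(2018, 11, 5, 14, 0), datetime(2018, 11, 5, 15, 30)),
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 1:00:00
```

With these two sample outages the average works out to one hour; shortening either outage lowers the figure, which is the sense in which improving MTTR reduces downtime.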

Mass and Emergency Notifications versus Operational Alerting

The field of alerting has matured considerably over the past 10 years, and there are now a number of specialized vendors in both mass notifications and operational alerting. That can make it very difficult to choose the right product for a specific business need. However, it is possible to identify these two main segments and to categorize alerting solutions accordingly, which makes it easier to judge fitness for purpose.

When Every Minute Matters

Human trafficking is a $150 billion criminal industry that denies freedom to over 40 million people globally, and it happens in every country in the world. Polaris is an organization dedicated to ending human trafficking and restoring freedom to survivors. For over a decade, Polaris has operated the U.S. National Human Trafficking Hotline.

PagerDuty API Introduction

Learn how easy it is to get up and running with the PagerDuty API in just a few minutes. Harness automation in your incident response and digital operations by leveraging PagerDuty’s REST-based API. This video covers basic concepts regarding APIs, REST, and JSON. You will also be introduced to PagerDuty’s industry-leading interactive API documentation, which automatically provides executable API code at your fingertips.

5 Best Practices for Resolving Errors Quickly

I love writing software, but I hate dealing with bugs. They take you away from what you want to be doing and often lead you into a rabbit hole. At Sentry—an open-source error tracking platform that provides complete app logic, deep context, and visibility across the entire stack in real time—we have a few tips that we’ve honed over time to make error resolution painless (ok, less painful), including an official integration with PagerDuty.