Incident Management

The latest news and information on incident management, on-call, incident response, and related technologies.

AWS: Operations Health and Best Practices

The ITOps world is a harsh working environment where ITOps personnel are expected to minimize the business impact of incidents at all hours of the day—regardless of the impact to themselves or their families. As more companies undergo digital transformation, the number of alerts and interruptions flowing to IT first responders will continue to increase.

PagerDuty Launches New AWS Integrations for CloudWatch, GuardDuty, CloudTrail, and Personal Health Dashboard

As you may expect from a company founded by former Amazon employees, PagerDuty has been helping AWS users automatically turn any signal into the right insight and action for years. Our Amazon CloudWatch integration enables teams to proactively mitigate customer-impacting issues, which in turn allows organizations to innovate and scale both their AWS and hybrid environments with confidence.

Uptime During the Holiday Shopping Season

In the United States, it’s almost that time of year again when we count our blessings and give thanks. For retail workers, it’s also the time of year when they prepare for the onslaught of eager shoppers who have waited hours in line to rush into stores and get their hands on doorbuster deals (sometimes knocking down employees in the process).

Meet Opsgenie at AWS re:Invent 2018: Making Incident Response Faster and More Efficient

It’s an exciting time here at Opsgenie. We recently joined the Atlassian family, updated our logo, released new pricing, and now we’re headed to AWS re:Invent 2018! So much has changed since last year’s event and we can’t wait to talk about it in person.

What is MTTR? Critical Incident Recovery Metrics to Reduce Downtime

Whether it’s scheduled maintenance or an unexpected outage, downtime is time your solutions are out of action and unavailable for use. Long or frequent periods of downtime impose significant costs on the company and ultimately undermine customer trust. So what is MTTR? And how can improving MTTR reduce downtime? Below are four key metrics to get you started.

Mass and Emergency Notifications versus Operational Alerting

The area of alerting has matured considerably over the past 10 years, and there are now a number of specialized vendors in both mass notifications and operational alerting. That can make it very difficult to choose the right solution for a specific business need. However, it is possible to identify these two main segments and to categorize alerting solutions accordingly, which supports a better decision on fitness for purpose.

When Every Minute Matters

Human trafficking is a $150 billion criminal industry that denies freedom to over 40 million people globally, and it happens in every country in the world. Polaris is an organization dedicated to ending human trafficking and restoring freedom to survivors. For over a decade, Polaris has operated the U.S. National Human Trafficking Hotline.