Incident Management

The latest news and information on incident management, on-call, incident response, and related technologies.

PagerDuty Incident Response Training (Summit Series Chicago 2017)

Dec 4, 2018 By PagerDuty In PagerDuty

Incident Response Training @ PagerDuty Summit Series Chicago, September 27th, 2017

Nexthink - Incident Reduction

Dec 4, 2018 By Nexthink In Nexthink

As organizations increasingly face an unprecedented volume of IT-related incidents, Nexthink focuses on detecting and correcting issues at their source. The result? Less IT disruption for your employees, lower IT support costs and mitigated risk associated with enterprise-wide application breakdowns.

Getting the Most out of Atlassian and Opsgenie

Dec 3, 2018 By Opsgenie In Opsgenie

Opsgenie is now part of the Atlassian family and we're excited to invite you to this exclusive webinar where we will demonstrate how to integrate Opsgenie's powerful alerting and on-call management tools with your entire Atlassian stack.

How to Raise the Bar on the ITIL's Recommendations for Critical Incident Management

Dec 3, 2018 By Noam Morginstin In Exigence

According to ITIL, the framework of best practices for delivering IT services, there is a recommended process flow for handling major incidents. Clearly, the IT community would be well served to follow ITIL’s systematic and professional approach, whose benefits, according to CIO Magazine, ...

Incident Response with AWS Systems Manager

Nov 29, 2018 By Kevin Landt In Opsgenie

The typical DevOps on-call engineer is responding to alerts, triaging based on service impact, troubleshooting high priority incidents, and taking action to remediate issues. Automation tools like AWS Systems Manager can be a big help in reducing some of the more repetitive work and allowing engineers to focus on the most important tasks.

Can You Trust Machine Learning In IT Operations?

Nov 29, 2018 By BigPanda In BigPanda

Chronically understaffed and constantly stressed-out IT Ops and NOC teams are overwhelmed by today’s IT noise. Artificial Intelligence (AI) and Machine Learning (ML) can help these teams because ML (and AI) are exceptionally good at processing enormous volumes of very complex data in real-time, or near real-time, and surfacing actionable insights.

IT Operations Can Trust Open Box Machine Learning From BigPanda

Nov 29, 2018 By BigPanda In BigPanda

Open Box Machine Learning from BigPanda is a revolutionary approach to machine learning in IT Operations.

Open Box Machine Learning From BigPanda Has a 100% Success Rate

Nov 29, 2018 By BigPanda In BigPanda

With BigPanda’s Open Box Machine Learning, teams can see the ML logic in plain English, edit it, test it, run what-if experiments, and add situational and tribal knowledge that strengthens it.

Reduce IT downtime with incident management

Nov 29, 2018 By Shawn Lazarus In OnPage

In the IT world, if a server can fail or traffic can overload the network, it will. And the consequences of downtime are significant. Many IT organizations face database, hardware, and software downtime that can last for short periods or shut down the business for days. According to Gartner, the average cost of network downtime alone is $5,600 per minute. What measures can organizations take to reduce IT downtime?

Announcing postmortems for Jira Ops

Nov 27, 2018 By Blake Thorne In Atlassian

Major incidents are inevitable, and fixing them is the top priority for any ops or DevOps team. But what happens after service is restored? Do teams take the time to fully understand what went wrong, then follow up to prevent it happening again?
