Incident Management

The latest news and information on incident management, on-call, incident response and related technologies.

New Event Source - Website Monitoring

Enterprise Alert keeps evolving to give our customers new ways to implement event sources and use new features. Version 9 adds several features that make it easier for customers to create alerts for specific processes and events, including the new “Website Monitoring” event source.

Self-Service for Teams in Enterprise Alert

A few days ago I had an insightful conversation with one of our customers, which inspired me to write this blog post. Like many other customers, he found that his Enterprise Alert management overhead grew with each new team he added, because he was also managing resources such as event sources, notification channels and alert policies for those teams. His question to us, therefore, was whether he could put these management tasks in the hands of the teams themselves.

Understanding a Microsoft Service Outage

Maintaining business continuity when an issue arises has proven to be a challenge for many organizations. The global pandemic that began in Q1 of 2020 (and that many businesses are still navigating) introduced a new set of problems for both service providers and the businesses that rely on their services.

Enhance NOC Alerts With Incident Management and Alert Automation

In a network operations center (NOC), alerts originating from hundreds of servers, application monitoring systems, emails and ticketing services compete for a NOC analyst’s attention. NOCs face many challenges in parsing these alerts to identify actionable notifications and mobilize the right response team.

Celebrities Explain WTF is Incident Management

Our friends Felicia Day, Steve Wozniak, and Brian Baumgartner help us explain what the heck incident management is. FireHydrant is the only comprehensive incident management platform that allows you to create consistency for the entire incident response lifecycle to focus on fighting fires faster. From alert to retrospective, tracking, communicating, and reporting on results: FireHydrant will automate the process so you can focus on resolution. Visit firehydrant.io to learn how you can manage the mayhem.

What is Opsgenie?

Opsgenie is an on-call, alert management and incident response solution that helps keep services always on. It empowers Dev and Ops teams to plan for service disruptions and stay in control during incidents. With over 200 deep integrations and a highly flexible rules engine, Opsgenie centralizes alerts, notifies the right people reliably, and enables them to collaborate and take rapid action.