Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Importance of 24×7 Communications in a Legal Setting

At OnPage, we communicate and interact with legal practices, understanding how their needs are changing, while helping them transform their operations. Through these conversations, we learned that an increasing number of practices are looking to transition to 24×7 after-hours communications, allowing potential and existing clients to reach legal practitioners (i.e., lawyers) at any time.

Incident Response 2.0 - The Zenduty Incident Command System (ICS)

We are super excited today to introduce our latest Zenduty integration with Slack, which we are calling the Zenduty Slack Incident Command System (Slack-ICS). It was many months in the making and went through multiple iterations, and we believe it will redefine proactive incident management and response.

Five Ways OnPage Reduces Physician Burnout

Physicians surveyed listed administrative tasks as a key cause of frustration and physician burnout. Burnout, in turn, leads to workplace issues, including delays in communication, mishaps in surgery and monitoring failures. Fortunately, OnPage designed its HIPAA-compliant clinical communications solution to help reduce physician burnout and improve care team collaboration, preventing dire consequences in the process.

Building Reliability Through Culture with Veteran Google SRE, Steve McGhee

Which of the following three scenarios do you experience the most when a new incident occurs? For many teams, incidents unfortunately fall into scenario 1, with some classes of incidents catching them by surprise. It’s astonishing that despite the vast amount of time we spend working on and thinking about our systems, we seem to have very little control over them. If we can’t predict where the next incidents will come from, then we will be forever stuck in a reactive cycle of repair.

Incident Alert Routing - reducing noise and getting woken up only by alerts that matter

Site reliability engineers have one of the toughest roles, if not the toughest, in any organization. While dealing with incidents is one part of the job, the other is to build reliable systems. Google’s SRE book sums up this approach nicely. One of the most important challenges for an SRE in balancing firefighting against toil reduction is alert noise.