The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.
Like death and taxes, IT incidents are inevitable. Issues like server outages and broken code are common—and costly. A single hour of downtime costs businesses more than $300,000 on average, according to Gartner. That’s why a solid incident management strategy is a must for any organization. “People solve incidents, but we can’t do it alone,” says Ali Rayl, Slack’s vice president of customer experience.
If you are still handing over a shared on-call duty phone or pager (sometimes called an ‘operations phone’), it is time to rethink your process. The Covid-19-induced new normal has had a dramatic impact on our work life and social behavior. We work from home, and that is especially true for the IT workforce. We meet with fewer people and limit our social network to relatives and close friends.
You might have noticed that we added a new type of alert source a few months ago: Heartbeat alert sources. A Heartbeat alert source expects a signal (the “heartbeat” ping) at regular intervals and alerts you if it doesn’t receive a ping within the specified interval.
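The heartbeat idea described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the vendor's implementation: the class name `HeartbeatMonitor`, its fields, and the `now` parameter are all hypothetical and chosen for this example.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HeartbeatMonitor:
    """Hypothetical heartbeat check: overdue if no ping arrives within the interval."""

    interval_seconds: float
    # Timestamp of the most recent ping; starts at creation time.
    last_ping: float = field(default_factory=time.monotonic)

    def ping(self, now: Optional[float] = None) -> None:
        # Record a heartbeat. `now` can be injected to make the logic testable.
        self.last_ping = time.monotonic() if now is None else now

    def is_overdue(self, now: Optional[float] = None) -> bool:
        # True when more than `interval_seconds` have passed since the last ping.
        current = time.monotonic() if now is None else now
        return (current - self.last_ping) > self.interval_seconds


# Usage: a job is expected to ping at least every 60 seconds.
monitor = HeartbeatMonitor(interval_seconds=60, last_ping=0.0)
if monitor.is_overdue(now=75.0):
    print("No heartbeat within 60s: raise an alert")
```

A real service would run the overdue check on a schedule and route the result to whoever is on call. The sketch only shows the timing logic.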
In the coming years, U.S. industries are poised to experience a changing of the guard. The majority of baby boomers will retire in the next decade. Their roles will be taken over by millennials (Generation Y), a digitally native generation that is familiar with modern technology. Generation Y must develop empathy and prepare for the challenge of bringing tech disruption to the workplace. Millennials must introduce new technologies without intensifying the anxiety of skeptical care providers.
The ability to detect performance issues and alert on them quickly is key to reducing the Mean Time to Resolve (MTTR). Proactive monitoring will catch incidents early, but triggering the right alerts and notifying the relevant incident management team is just as critical. Enterprises rely on multiple disparate tools to monitor different systems, so they generate a lot of data and noise, which can render incident management inefficient.
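For readers who want to see how MTTR is typically computed, here is a minimal sketch. It assumes resolve time is measured from when an incident is opened to when it is resolved. Teams define the starting point differently, and the function name and sample timestamps below are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def mean_time_to_resolve(incidents: List[Tuple[datetime, datetime]]) -> timedelta:
    """Average (resolved - opened) duration over a list of incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the individual durations, starting from a zero timedelta.
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


# Illustrative data only: one 30-minute and one 90-minute incident.
sample = [
    (datetime(2021, 1, 4, 9, 0), datetime(2021, 1, 4, 9, 30)),
    (datetime(2021, 1, 5, 14, 0), datetime(2021, 1, 5, 15, 30)),
]
print(mean_time_to_resolve(sample))  # 1:00:00
```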
We are super excited to share that we are currently testing and in the process of rolling out a new desktop global navigation to all of our users. Things that are clear in retrospect often emerge from ambiguous and humble beginnings. Initially built as a simple on-call management tool for IT responders, PagerDuty has evolved into an end-to-end, enterprise-grade digital operations platform.
The ability to automate your incident response process means you can start responding to incidents faster. So it’s easy to see why FireHydrant Runbooks is so popular within the platform. When you let automation take over, you can spend more time fixing problems and keeping your customers happy. Now, with the addition of conditions, you can create even more powerful automation.
Our release of conditions in FireHydrant Runbooks has made it easier for teams who rely on email to communicate with key stakeholders or a distribution list. 💡 If your team uses Slack and you haven’t already installed our Slack integration, you should definitely check it out, as it’s the easiest way to automate updates to channels when the status of an incident changes.