Alerting

Alert fatigue, part 3: automating triage & remediation with check hooks & handlers

In many cases, when you’re monitoring a particular state of a system, you already know some steps to triage the problem or, in some cases, fix it automatically. Let’s take a look at how we can automate this using check hooks and handlers.

6 Ways to Avoid the 'Swivel-Chair' Effect

When an incident occurs, do you shudder as you or your team open multiple browser tabs for each of your monitoring tools? That is the “swivel-chair” effect: context-switching between tools to gather the information needed to determine a path to resolution.

Incident Management (class SRE implements DevOps)

In the previous video, Liz and Seth discussed how to make systems observable and how observability helps us diagnose failing systems, but didn't cover what to do when an incident grows beyond the ability of one person to do it all. In this video, you learn about the most important part of the incident management process – humans.

Avert a Website Meltdown With These Awesome Features

Our primary focus at Uptime.com is creating a tool that can monitor every critical piece of infrastructure that drives the work you do. We created a series of checks to accomplish this task, with API and Transaction checks offering unprecedented flexibility. The next step was a mechanism for controlling how alerts were issued. The Advanced Check Options we’ll look at today are aimed at controlling when and how alerts are issued.

Survey reveals rapidly growing role of IT Service Alerting

In a survey conducted at Microsoft Ignite 2018 in Orlando, Florida, Derdack investigated the state of IT alerting solutions among businesses. The survey is based on 368 participants, randomly selected from among IT professionals visiting the expo show floor. It revealed whether businesses use IT alerting solutions (ITSA, “IT Service Alerting”) and, if so, which ones, to support their IT operations and to respond faster to major and critical IT incidents.

Determining Fair & Competitive On-Call Compensation

The key to choosing the best compensation plan is finding a solution that works well for your company but also recognizes employees for their effort and time spent. If employees are well cared for, they will, in turn, care about the business and contribute to its success. After choosing a method and confirming that it abides by local laws, be sure to examine the following to confirm that compensation is competitive and fair and that your team can adequately share the load that on-call requires.