Major outages are bound to occur in even the most well-maintained infrastructure and systems. Being able to quickly classify an incident's severity level allows your on-call team to respond more effectively. Imagine a scenario where your on-call team is getting critical alerts every 15 minutes, user complaints are piling up on social media, and, because your platform is down, revenue losses are mounting every minute. How do you go about getting your application back on track? This is where understanding incident severity and priority can be invaluable. In this blog, we look at severity levels and how they can improve your incident response process.
Imagine being part of an overactive group chat that causes your phone to buzz every few minutes. At first, you open every message, but you soon realize that most of them aren't important, or at least aren't relevant to you. So, what do you do next? Maybe you let the messages pile up and check them later. Or perhaps you mute the group chat and ignore the incoming messages altogether. You can blame this tendency to ignore or avoid incoming messages and notifications on one culprit: alert fatigue.