I love writing software, but I hate dealing with bugs. They take you away from what you want to be doing and often lead you into a rabbit hole. At Sentry—an open-source error tracking platform that provides complete app logic, deep context, and visibility across the entire stack in real time—we have a few tips that we’ve honed over time to make error resolution painless (ok, less painful), including an official integration with PagerDuty.
At many points in a hospital’s functioning, workflow touches the outcome. The problem facing much of healthcare, though, is that the established workflow for alerting and messaging physicians is broken. What are some ways to improve how doctors are scheduled? What are the potential impacts of those improvements?
You’ve just recovered from a critical application outage and your team is being asked to report on root cause and recommended remediation steps later this afternoon. Can you quickly analyze all the data, identify all the leading events, and discern which one was responsible for the cascading failure?
By now you’ve learned about reducing the sheer number of alerts you’re getting as well as automated triage and remediation. In this post, I’ll go into some extra steps you can take to further fine-tune Sensu and cut down on alert fatigue.