Better root cause analysis: Mastering alert insights with the new central history timeline
A year ago, we rebuilt our alert rule state history, using Grafana Loki for storage and updating the UI to display a timeline of every state change for an alert rule. As a result, users can now conduct better root cause analysis by drilling down to an individual alert rule and seeing when specific alert instances started or stopped firing. But we aren’t stopping there. To keep your system stable and avert outages, you also need one place to see the state history for all the alerts in your system.