How to Reduce MTTR With PagerDuty and Puppet's Relay
DevOps and SRE teams are under intense pressure to reduce the mean time to recovery (MTTR) when resolving incidents. With the proliferation of cloud services and the increasing complexity of DevOps toolchains, engineers must not only learn how to use these services but also troubleshoot them when an incident is raised at 2 a.m. The problem is that many incident response processes are still manual: teams cobble together runbooks and ad hoc scripts and orchestrate people to respond.