The latest news and information on Site Reliability Engineering and related technologies.
With SREcon not happening this year and the world turned upside down by COVID-19, we set out to hold a virtual event to bring SREs together to share their experiences of what has changed. Last week’s SRE from Home was exactly that. With 1,900 registrants, 20 lively Slack channels, six illuminating and entertaining talks from a diverse range of experts in the field, and our #askanSRE panel answering attendees’ questions with candid generosity, it was an amazing, jam-packed day.
Site Reliability Engineering (SRE) is a practice for managing the reliability of systems that began at Google in the early 2000s. Ben Treynor Sloss from Google started the first SRE team and coined the name.
We recently released Catchpoint’s SRE Report 2020, which analyzes results from the SRE survey we conducted early this year along with a recent addendum survey. The report offers a detailed look at the current state of SRE and how the shift to an all-remote work environment has affected SRE teams. In this blog post, we take a deeper look at one of the report highlights: ‘Heavy Ops Workload Comes at a Cost’.