Blameless

5 Exciting Predictions for SRE in 2023

SRE is a field defined by its constant evolution: from Google’s in-house secret recipe, to the hottest new practice for the biggest enterprise orgs, to a diverse and holistic mentality practiced by orgs of all sizes. Earlier this year, we co-sponsored the Catchpoint State of SRE survey, which took the temperature of SRE as it stands today. Now, as we did in 2021 and 2020, we’ll turn to the future to speculate on what 2023 will bring for SRE.

How to talk to your executive leadership team about reliability

Product reliability requires investment from all areas of the business. Technology leaders must effectively communicate the implications of service reliability to the rest of the organization. As a leader, how do you prove that a more reliable product is critical to success? Experts from BetterCloud, Machinify and Blameless come together to discuss how to talk to your executive leadership team about reliability in this webinar.

Incident Management Tools - Do I Even Need Them?

Software is hard… Maintaining software reliability is harder than it used to be. Software systems have grown dramatically in complexity as they’re applied in a wider range of applications and environments, many of which have become fundamental to the everyday function of our society. At the same time, the pace of software development and release is faster than ever. Innovating new features faster than competitors has become the key to success in a rapidly changing market.

Failure Analysis: Engineering incidents are a bigger problem than you think

Engineering incidents can be quite harmful for companies, both in terms of financial costs and reputational damage. In some cases, engineering incidents can even put people's lives at risk, which can have serious legal and moral implications for the company involved.

SRE Maturity Model: How Do You Assess Your Team?

How do you evaluate your SRE team’s progress in implementing SRE? We discuss the key SRE indicators for evaluating your team’s progress in the SRE maturity model. What is the SRE maturity model? The SRE maturity model is a way of judging how far you are in implementing SRE principles. It is a method used by teams to understand where they ought to implement more SRE best practices to reach greater SRE maturity.

How to design an effective incident on-call program

If anyone on your team has paged a colleague in the middle of the night, your DevOps team has an incident on-call program. Whether that team member knew who to page, and felt comfortable sending the page, is indicative of your on-call program's effectiveness. Join Thai Wood, founder of Resilience Roundup, and Matt Davis, SRE Advocate at Blameless, as they discuss how to design an effective on-call program. This webinar was recorded live on December 13, 2022.

Tag You're It: Organized, Configurable Tagging is a Must-do for Great Incident Analytics.

Wouldn’t it be nice to learn which parts of your service see the most incidents, or why one service experiences more Sev1 incidents than the others? It’s not always easy to see the full disruptive impact of an engineering incident, and it’s even harder to see trends across incidents and over time. Developing incident insights that you can use to guide and shape the way your team designs and operates your product takes time, careful consideration, team engagement and the right tooling.