SRE

The latest news and information on Site Reliability Engineering and related technologies.

Blameless SRE Resilience Panel

Oct 2, 2020 By Honeycomb In Honeycomb

SRE Leaders Panel: Testing In Production at Blameless

Oct 2, 2020 By Honeycomb In Honeycomb

Ask an SRE Panel Talk

Oct 2, 2020 By Honeycomb In Honeycomb

Our SRE Leaders Panel series gathers leading minds in the SRE and resilience community to share their insights. In this edition, we are so excited to have an amazing all-women panel who will be diving deep into testing in production. The event will consist of 40 minutes of roundtable discussion with Shelby and Talia, facilitated by Blameless' Staff SRE Amy Tobey, followed by 20 minutes of Q&A from the audience. This is an open and candid discussion, so come with your questions. We look forward to seeing you there!

This is your Guide for Implementing SRE in NOCs

Oct 1, 2020 By Emily Arnott In Blameless

Network Operation Centers, or NOCs, serve as hubs for monitoring and incident response. A NOC is usually a physical location in an organization, where operators sit at a central desk with screens showing current service data. But the functionality of a NOC can also be distributed: some organizations build virtual NOCs that can be staffed fully remotely, which allows for distributed teams and follow-the-sun rotations. NOC as a service is another structure gaining in popularity.

SREview Issue #5 September 2020

Sep 15, 2020 By Blameless Community In Blameless

Here’s the September issue of SREview! This monthly zine features epic Tweets, content, and events happening in the SRE and resilience engineering community.

SRE Leaders Panel: Testing in Production

Sep 11, 2020 By Blameless Community In Blameless

Blameless recently had the privilege of hosting some fantastic leaders in the SRE and resilience community for a panel discussion. Our panelists discussed testing in production, how feature flagging and testing can help us do that, and how to get managers to be on board with testing in production. The transcript below has been lightly edited, and if you’re interested in watching the full panel, you can do so here.

SRE + Honeycomb: Observability for Service Reliability

Sep 8, 2020 By Jenni Boyer In Honeycomb

As a Customer Advocate, I talk to a lot of prospective Honeycomb users who want to understand how observability fits into their existing Site Reliability Engineering (SRE) practice. While I have enough of a familiarity with the discipline to get myself into trouble, I wanted to learn more about what SREs do in their day-to-day work so that I’d be better able to help them determine if Honeycomb is a good fit for their needs.

How to Build Your SRE Team

Sep 1, 2020 By Emily Arnott In Blameless

As you implement SRE practices and culture at your organization, you’ll realize everyone has a part to play. From engineers setting SLOs, to management upholding the virtue of blamelessness, to marketing teams conducting retrospectives on email campaigns, there’s no part of an organization that doesn’t benefit from the SRE mentality.

SREview Issue #4 August 2020

Aug 21, 2020 By Blameless Community In Blameless

Here’s the August issue of SREview! This monthly zine features epic Tweets, content, and events happening in the SRE and resilience engineering community.

What is a Kubernetes Operator and Why it Matters for SRE

Aug 20, 2020 By Emily Arnott In Blameless

Kubernetes is an open-source project that orchestrates containerized workloads and services, managing their deployment and configuration. First released by Google in 2014, with version 1.0 following in 2015, Kubernetes is now maintained by the Cloud Native Computing Foundation. Since its release, it has become a worldwide phenomenon. The majority of cloud native companies use it, SaaS vendors offer commercial prebuilt versions, and there’s even an annual convention!
