Latest News

SREview Issue #16 August 2021

Aug 10, 2021 By Blameless Community In Blameless

We’re kicking off August with some thrilling news: Blameless has closed a $30M Series B funding round! We’re entering the next phase of our journey to advance reliability for engineering teams. We’re so grateful to our customers, collaborators, and the entire SRE community for their support! Let’s dive in with our favorite content for the month!

Incident Management Goes to the Olympics

Aug 5, 2021 By Quentin Rousseau In Rootly

A look at outages and disruptions to the IT systems that power the Olympics, from 1996 to today.

Demystifying DevOps and SRE

Aug 4, 2021 By James Samuel In Squadcast

How different are DevOps and SRE? Are they related to each other? In this blog post, James Samuel sheds light on the similarities and differences between SRE and DevOps, followed by possible ways to structure an SRE team in your organization. SRE and DevOps are two terms that people often find confusing. People often ask: should I hire a DevOps Engineer or a Site Reliability Engineer? What is the difference between SRE and DevOps, and which one do I need? In this post, I attempt to shed some light.

New Product Integration! Microsoft Teams Video

Aug 3, 2021 By Emily Arnott In Blameless

On the heels of our Microsoft Teams integration release to streamline incident management, we’re excited to share that we now support Microsoft Teams Video capabilities. We generate Microsoft Teams video conference links for each Blameless incident for fast and easy collaboration. Microsoft Teams Video joins Zoom, Google Meet, and GoToMeeting in our video integration suite.

Resilience in Action E9: Vulnerability, Compassion, and Post-Incident Reviews in the Emergency Room with Dr. Al'ai Alvarez

Aug 2, 2021 By Christina Tan In Blameless

What can software engineers learn from post-incident reviews that physicians do in the emergency room? In our ninth episode, Christina, member of the Blameless strategy team, guest-hosts the podcast to interview both Kurt Andersen and Al'ai Alvarez, MD (@alvarezzzy). Dr. Alvarez is an assistant clinical professor of Emergency Medicine at Stanford. Clinically, he’s an emergency physician.

The Unique Reliability Engineering Requirements of Microservices

Jul 30, 2021 By JJ Tang In Rootly

Although the fundamental concepts of site reliability engineering are the same in any environment, SREs must adapt practices to different technologies, like microservices.

Most frequently asked questions surrounding Google's Cloud Operations Sandbox

Jul 29, 2021 By Nir Sharma In Squadcast

Cloud Operations Sandbox serves as a simulation tool for budding SREs to learn best practices from Google and apply them to real cloud services. In this blog post, we have compiled a list of FAQs surrounding the use of Google's Cloud Operations Sandbox. The Google SRE sandbox provides an easy way to get started with the core skills you need to become an SRE.

What are the Four Golden Signals?

Jul 29, 2021 By Blameless In Blameless

SRE’s Golden Signals are four key metrics used to monitor the health of your service and underlying systems. We will explain what they are, and how they can help you improve service performance.

Reliability Matters. Blameless is Growing with Series B $30M Funding

Jul 27, 2021 By Lyon Wong In Blameless

When Blameless started in 2018, the team set out on a mission to help all engineers achieve reliability with less toil and risk. Three years in, that mission has become more important than ever. What has changed is the rate of SRE adoption: SRE is now the fastest-growing team and practice inside engineering. This represents a clear recognition of the many upsides that an SRE practice brings with its combination of continuous learning, velocity, and resilience.

How to Notify Your Team of Errors: Email vs. Slack vs. PagerDuty

Jul 26, 2021 By LogDNA In Mezmo

Site Reliability Engineering (SRE) and Operations (Ops) teams heavily rely on notifications. We use them to know what’s going on with application workloads and how applications are performing. Notifications are critical to ensuring SREs and Ops teams can resolve errors and reduce downtime. They’re also crucial when monitoring environments — not only when running in production but also during the dev-test or staging phase.
