The latest news and information on Site Reliability Engineering and related technologies.

Beginners Guide to Incident Postmortems

Feb 7, 2021 By Camille Hodoul In Rootly

Successful and blameless postmortems can turn incidents into a gift of learning and prevent repeat mistakes.

7 Tips On Building And Maintaining An SRE Team In Your Company

Jan 22, 2021 By Squadcast Community In Squadcast

In today's "always on" world, reliability is a primary business KPI. Plant a culture of reliability by implementing these 7 simple tips to build a solid SRE team in your organization. Many of today’s hottest jobs didn’t exist at the turn of the millennium. Social media managers, data scientists, and growth hackers were unheard of back then. Another relatively new role in demand is that of the Site Reliability Engineer, or SRE.

Take the first step toward SRE with Cloud Operations Sandbox

Jan 22, 2021 By Simon Zeltser In Google Operations

At Google Cloud, we strive to bring Site Reliability Engineering (SRE) culture to our customers not only through training on organizational best practices, but also with the tools you need to run successful cloud services. Part and parcel of that is comprehensive observability tooling—logging, monitoring, tracing, profiling and debugging—which can help you troubleshoot production issues faster, increase release velocity and improve service reliability.

The Key Differences between SLI, SLO, and SLA in SRE

Jan 20, 2021 By Biju Chacko In Squadcast

To incentivize reliability in your platform, your team needs shared goals that measure and quantify both the capabilities of your product or service and the customer experience. Define the path to "Always-On" services by understanding a few key SRE fundamentals and their implications: SLIs, SLOs, and SLAs. Framing SRE metrics for building or scaling a product is a daunting task.

2021 is the Year of Reliability

Jan 20, 2021 By Robert Ross In FireHydrant

There’s no better time than now to dedicate effort to reliable software. If it wasn’t apparent before, this past year has made it more evident than ever: people expect their software tools to work every time, all the time. This shift in the way end users think about software was inevitable as everyday applications entered our lives, much as water and electricity entered our homes.

Building and Scaling Your SRE Team

Jan 12, 2021 By Julie Gunderson In PagerDuty

Building Site Reliability Engineering (SRE) teams is hard! There are so many articles and explanations of what SRE means that it’s easy to get lost. Moving from understanding the individual SRE role to building and scaling a team of SREs is an even bigger challenge. It’s important to find the right information to help you take your SRE team to the next level.

Top Observability tools for DevOps Engineers and SREs

Dec 28, 2020 By Nir Sharma In Squadcast

Better visibility is the first step to improved system stability. Our latest blog post outlines the top observability tools for DevOps engineers and SREs to help you start gaining valuable insights into your infrastructure. “We can't fix something which we can't observe”: whether it's a steam engine or a complex microservice-based cloud deployment, great observability makes troubleshooting easier.

From SysAdmin to SRE: How to evolve your skillset

Dec 16, 2020 By Biju Chacko In Squadcast

Are you wondering what it takes to become an SRE from a SysAdmin background? Our latest blog post covers the growth areas and technical skills needed to transition successfully to an SRE role. The last decade has seen widespread adoption of SRE practices based on the best practices laid out by Google. Many SysAdmins have observed this trend and are now considering becoming SREs. This raises the question: how much do the skills of an SRE and a SysAdmin overlap?

How to SRE without an SRE on your team

Nov 27, 2020 By Biju Chacko In Squadcast

Are terms like “error budgets” and “SLOs” roadblocks on your way to adopting SRE practices in your organisation? Our latest blog post, "How to SRE without an SRE on your team", looks at some of the most elementary SRE concepts that you can start implementing right away. We help you pick SLOs, identify toil, and touch on automation for SREs, along with a few best practices to get you started on your SRE journey.

Top Open Source projects for SREs and DevOps

Nov 13, 2020 By Squadcast In Squadcast

Building scalable and highly reliable software systems is the ultimate goal of every SRE. Follow the path of continuous learning with our latest blog post, which outlines some of the most sought-after open source projects in the monitoring, deployment, and maintenance space. The path to becoming a successful SRE lies in continuous learning.
