
SRE

The latest news and information on Site Reliability Engineering and related technologies.

SRE Principles: The 7 Fundamental Rules

In one of our previous articles, we discussed what an SRE is, what they do, and some of the common responsibilities that a typical SRE may have, like supporting operations, dealing with trouble tickets and incident response, and general system monitoring and observability. In this article, we will take a deeper dive into the various SRE principles and guidelines that a site reliability engineer practices in their role.

How to improve your influence as an SRE

Improving your influence within the company helps you deliver high-quality work, because your goals become closely aligned with those of the business. In this blog post, Ricardo explains how to improve your influence as an SRE. Balancing fast-paced business requirements with the demands of keeping production services stable is not an easy task.

Enabling SRE best practices: new contextual traces in Cloud Logging

The need for relevant and contextual telemetry data to support online services has grown in the last decade as businesses undergo digital transformation. These data are typically the difference between proactively remediating application performance issues and suffering costly service downtime. Distributed tracing is a key capability for improving application performance and reliability, as noted in SRE best practices.

Podcast: Break Things on Purpose | Gustavo Franco, Senior Engineering Manager at VMware

In this episode, Jason is joined by Gustavo Franco, Senior Engineering Manager at VMware, to chat about chaos in Gustavo's early days. Gustavo reflects on everything from Google's early disaster recovery practices to the contemporary SRE movement.

How they SRE: Insights from the Cloudflare SRE team

Cloudflare is a global cloud services provider with a presence around the world, from San Francisco, US to London, England to Sydney, Australia. Their mission, as stated front and center on their homepage, is to help build a better Internet. While that may read like hyperbole, their numbers are impressive: Cloudflare has over 126,000 paying customers, and 95% of Internet users in the developed world are within 50 ms of their network.