Latest News

How to improve your influence as an SRE

Improving your influence within the company helps you deliver high-quality work, because your goals become closely aligned with the company's. In this blog piece, Ricardo explains how to improve your influence as an SRE. Balancing fast-paced business requirements with the demands of keeping production services stable is not an easy task.

Enabling SRE best practices: new contextual traces in Cloud Logging

The need for relevant and contextual telemetry data to support online services has grown in the last decade as businesses undergo digital transformation. These data are typically the difference between proactively remediating application performance issues and suffering costly service downtime. Distributed tracing is a key capability for improving application performance and reliability, as noted in SRE best practices.

Podcast: Break Things on Purpose | Gustavo Franco, Senior Engineering Manager at VMware

In this episode, Jason is joined by Gustavo Franco, Senior Engineering Manager at VMware, to chat about chaos in Gustavo’s early days. Gustavo reflects on everything from Google’s early disaster recovery practices to the contemporary SRE movement.

How they SRE: Insights from the Cloudflare SRE team

Cloudflare is a global cloud services provider with a presence all over the globe, from San Francisco, US to London, England to Sydney, Australia. Their mission, as stated front and center on their homepage, is to help build a better Internet. While that may read like hyperbole, their numbers are impressive: Cloudflare has over 126,000 paying customers, and 95% of Internet users in the developed world are within 50ms of their network.

DevOps Culture: How to Build a Stronger Team

Trying to improve your DevOps team? We’ll explain what DevOps culture is, how it benefits your team, and how you can build it within your organization. So what is DevOps culture? The main goals of DevOps culture are to increase collaboration and communication between teams, to give all participants a shared responsibility in the project, and to emphasize learning opportunities instead of spreading blame when things go wrong.

Site Reliability Engineer (SRE) Roles and Responsibilities

Software development is getting faster and more complex, frustrating IT operations teams more than ever. DevOps gained popularity as a way to combat siloed workflows, decreased collaboration, and a lack of visibility. While establishing a culture of DevOps has helped teams collaborate better and deliver reliable software faster, DevOps teams don’t necessarily have someone specifically dedicated to developing systems that increase site reliability and performance.