Latest Posts

Lessons in Incident Response I Learned While Waiting Tables

Dec 13, 2023 By Ashley Sawatsky In Rootly

Before I stumbled into the tech industry (a story for another day), I spent several years in the customer service world as a server and front-of-house manager in restaurants. It was in these jobs that I first honed some critical skills that would later lead me on the path to incident response.

When More Incident Commanders are Better

Dec 6, 2023 By Strong Liang In Rootly

This post has been lightly revised and reposted with the author's permission from the original article on Medium. Leading major incident responses can be extremely stressful. You have to quickly gather an ad-hoc team, figure out what went wrong, identify a fix, and make sure the fix doesn't make things worse, all while senior leadership is breathing down your neck. Are we having fun yet? Many people think having a dedicated incident commander role will solve the problem.

Status Pages 101: How to Create a Status Page You and Your Customers Will Actually Want to Use

Nov 2, 2023 By Ashley Sawatsky In Rootly

This blog post is adapted from my talk at SRECon EMEA 2023 - original slides are available here! Status pages are a simple yet underutilized element of incident communication. Done well, they’re a low-lift way to keep your customers and stakeholders informed when incidents impact them. But without a solid approach, updating status pages can easily become a tedious and often neglected task during incidents. In this post, we’ll cover some tips to get your status page right.

Working Effectively With Executives During an Incident

Oct 2, 2023 By Ashley Sawatsky In Rootly

You’re in the incident channel rocking yet another incident. Comms are flowing, resolution is in sight, the team is grinding, and you’re feeling good. Then…

Top 5 Resiliency Trends of 2023

Sep 20, 2023 By Rohit Ghumare In Rootly

In today’s world, resilience is no longer a nice-to-have or a methodology to experiment with; it has become a necessity for sustained success in software development and IT operations. As DevOps and Agile teams push past boundaries, adopt new methodologies, and drive innovation, the ability to recover quickly from failures, adapt to changing conditions, and maintain high performance under pressure has become essential.

Celebrating Our Nine New G2 Awards

Sep 5, 2023 By JJ Tang In Rootly

We’re proud to share that we've been recognized as a High Performer and Enterprise Leader in Incident Management for the sixth consecutive quarter in the G2 Summer 2023 Report! In total, Rootly received nine G2 awards in the Summer Report.

We Need to Talk About the Hero Pattern Among SREs

Aug 22, 2023 By Hans Chung In Rootly

Let’s be honest. When you see an alert pop up on your phone, you aren’t thinking, “According to section 12 of our most recent SRE handbook, used at training 6 months ago, I need to keep in mind who should be Incident Commander and who should be Ops Lead.” You’re an engineer at heart.

But It's Not Our Fault! When Third-party Incidents Affect Your Service

Aug 14, 2023 By Ashley Sawatsky In Rootly

Very few SaaS products exist completely independently. Between cloud service providers, payment processors, content delivery networks, and more, chances are you rely on external systems to keep your product working. When these systems fail, it can leave you feeling pretty helpless. In some cases you might have fallback options, but oftentimes all you can do is wait for recovery and clean up the fallout.

Rootly Raises $12 Million from Renegade Partners, Google Gradient Ventures, & XYZ Ventures

Aug 10, 2023 By JJ Tang In Rootly

We are excited to announce that we have raised a $12M round of financing led by Renegade Partners with participation from Google Gradient Ventures (Google’s AI-focused venture fund) and XYZ Ventures. This brings our total funding to date to $15.2M ($20M CAD) alongside our other existing investors Y Combinator and 8VC.

Kubernetes Incident Management Best Practices

Aug 3, 2023 By Rajesh Tilwani In Rootly

Creating just any infrastructure on Kubernetes is not enough. You could apply a handful of basic configurations to stand up infrastructure for your application, and for the time being it might work just fine. But incident response won’t always remain 100% reliable. You will run into new potholes, and that’s okay.
