Latest Posts

De-Siloing Incident Management: How to Make Reliability Engineering Everyone's Job

Jul 15, 2021 By JJ Tang In Rootly

4 best practices for breaking down silos and establishing a culture of shared responsibility toward reliability.

Rootly Announces $3.2 Million in Seed Funding from XYZ Venture Capital, 8VC, & Y Combinator

Jul 8, 2021 By Quentin Rousseau In Rootly

Rootly is on a mission to create a world where maintaining reliability is frictionless, delightful, and accessible to anyone, making resolving and learning from incidents every organization's superpower.

The Incident Review: 4 Incidents in Outer Space

Jul 6, 2021 By JJ Tang In Rootly

From network problems to computer failures, a variety of incidents can disrupt operations for systems in outer space.

7 Essential Tools for SREs

Jun 25, 2021 By Quentin Rousseau In Rootly

From chaos engineering to monitoring and beyond, SREs rely on several key types of tools to do their jobs.

Practical Guide to SRE: Incident Severity Levels

Jun 17, 2021 By Nancy Chauhan In Rootly

Incident severity levels measure the impact an incident has on the business. Classifying the severity of an issue is critical to deciding how quickly and efficiently problems get resolved.

The Incident Review: 4 Times When Typos Brought Down Critical Systems

Jun 3, 2021 By JJ Tang In Rootly

Sometimes, as these 4 incidents highlight, major failure results from a mere typo or configuration oversight.

Incident Management vs. Incident Response - What's the Difference?

May 28, 2021 By Quentin Rousseau In Rootly

What are the differences between incident management and incident response? The answer varies widely depending on whom you ask.

The Incident Review: 4 Odd Incidents Caused by Animals

May 21, 2021 By JJ Tang In Rootly

Incidents and outages caused by animals highlight the importance of flexibility and out-of-the-box thinking when it comes to SRE.

Practical Guide to SRE: Using SLOs to Increase Reliability

May 13, 2021 By Quentin Rousseau In Rootly

Service Level Objectives (SLOs) are a key component of any successful Site Reliability Engineering initiative. The question is: what are SLOs, and how do you determine what yours should be? And once you've done that, how should you use them?

Practical Guide to SRE: Automating On-Call

May 6, 2021 By JJ Tang In Rootly

Let's face it: on-call work isn't fun. But it can be better. Even if you have to be on call, it would be nice to have at least some of the work done for you before you drag yourself out of bed at 3 a.m. to respond to an incident.
