Gremlin

How Detected Risks helps you find reliability risks in minutes, without running any tests

Aug 30, 2023 By Gremlin In Gremlin

This video showcases Gremlin's Detected Risks feature. Detected Risks are high-priority reliability concerns that Gremlin automatically identifies in an environment. These include misconfigurations, bad default values, and reliability anti-patterns. Gremlin prioritizes these risks by severity and impact, giving immediate feedback on risks and action items to improve the reliability and stability of each service.

View Video

Four Pillars of a Best-in-Class Reliability Program

Aug 30, 2023 By Gavin Cahill In Gremlin

Reliability impacts every organization, whether you plan for it or not. Leading companies take matters into their own hands and get ahead of incidents by building reliability programs. But since many of these programs are still nascent, how do you know what good looks like? Of course, the right tools and technology play an important role by enabling your team to uncover reliability risks before they impact users. But improving reliability goes beyond technology.

Read Post

Announcing the Gremlin Enterprise Chaos Engineering Certification (GECEC) program

Aug 23, 2023 By Andre Newman In Gremlin

We knew Chaos Engineering was in high demand when we first launched the Gremlin certifications in 2021. But we had no idea our Chaos Engineering certification programs would be such a success. There’s a reason: the market is looking for professionals who know how to wield Chaos Engineering well, and Gremlin's certification has become the gold standard for learning the principles of Chaos Engineering and demonstrating proficiency.

Read Post

Reliability Best Practices: How Gremlin Uses Gremlin

Aug 7, 2023 By Gavin Cahill In Gremlin

Ensuring software availability is essential for any SaaS company—including Gremlin. To do that, our teams need to identify the reliability risks hiding in our systems. That’s why our development, platform, and SRE teams use Gremlin regularly to perform Chaos Engineering experiments, run reliability tests, and track the reliability of our systems against our standards. Along the way they’ve picked up a thing or two about how to find and fix reliability risks with Gremlin.

Read Post

How to use the reliability tracker spreadsheet

Jul 20, 2023 By Gremlin In Gremlin

Learn how to identify and track reliability risks, prioritize fixes, and prove results to your organization using Gremlin's free reliability tracker spreadsheet.

View Video

How to Show Reliability Results to Your Organization

Jun 1, 2023 By Gavin Cahill In Gremlin

Building momentum for a reliability program can be tough. Improving reliability takes time, effort, and resources. But when everything from launching new features to improving security demands those same resources, it can be a struggle to get the buy-in you need to address reliability risks. And it makes sense! If a team spends time patching a known security bug or creating a new feature, they have a clear demonstration of the value created.

Read Post

Don't Just React to Incidents: Prevent Them

May 9, 2023 By Gavin Cahill In Gremlin

Incident response has been the cornerstone of reliability for decades. From digging in the server logs to navigating modern observability dashboards, responding quickly to incidents and outages is a big part of minimizing downtime. And it should be! When something breaks, your team should move as quickly as possible to address and repair the problem.

Read Post

Chaos Engineering Tools: Myth vs Fact

Apr 4, 2023 By Gavin Cahill In Gremlin

With so many Chaos Engineering tools available, it’s no surprise that SRE and platform leaders are doing their homework when choosing a platform to help them build and scale their Chaos Engineering programs. But like anything else you can research on the internet, there’s a lot of noise and hype that you need to wade through. Gremlin works with Reliability Engineering teams at hundreds of companies with the most sensitive workloads—and has since 2016.

Read Post

What is Gremlin?

Mar 28, 2023 By Gremlin In Gremlin

Today’s technology leaders are facing a reliability gap. Customers expect their apps to be fast and available. But with DevOps and distributed systems driving more speed and complexity, it’s harder than ever to find and fix the reliability risks that can impact customer experience before it’s too late. To close the reliability gap, we need a reliability strategy: one that’s proactive, measurable, built in, and automated. We need a reliability management platform.

View Video

Five Trends from SREcon Americas 2023

Mar 27, 2023 By Gavin Cahill In Gremlin

Last week, over five hundred SREs gathered in Santa Clara to share the latest research, tips, tricks, best practices, and more for site reliability engineering. They were joined by some of the biggest names in the reliability space. And, yes, Gremlin was there to answer any and all questions about chaos engineering and proactive reliability. After three days of great conversations and insightful talks, let’s take a look at some of the themes we heard weaving through SREcon.

Read Post
