Gremlin

Introduction to Gremlin Reliability Management

Oct 24, 2022 By Gremlin In Gremlin

Gremlin Reliability Management helps teams standardize and automate reliability, one service at a time. In this video, we walk through the platform by showing you how to add your services to Gremlin, integrate your Golden Signals, run reliability tests, and generate reliability scores.

View Video

What is Reliability Management?

Oct 20, 2022 By Andre Newman In Gremlin

Measuring and improving the reliability of technical systems has always been challenging. As an industry, we've developed several practices to try and address reliability concerns, such as incident response, observability, and Chaos Engineering. This led SREs and service owners to measure reliability in a handful of ways.

Read Post

Setting better SLOs using Google's Golden Signals

Oct 11, 2022 By Andre Newman In Gremlin

To many engineers, the idea that you can accurately and comprehensively track your application's user experience using just a few simple metrics might sound far-fetched. Believe it or not, there are four metrics that aim to do just that. They're called the four Golden Signals and should be a core part of your observability and reliability practices.

Read Post

How to add a Service to Gremlin Reliability Management (RM)

Sep 2, 2022 By Gremlin In Gremlin

This short demo video shows you how to add a Kubernetes service to Gremlin Reliability Management (RM). We'll walk you through selecting the parts of your infrastructure that make up your service, identifying processes for dependency detection, and adding your Golden Signals.

View Video

Introduction to Gremlin Reliability Management (RM)

Sep 2, 2022 By Gremlin In Gremlin

Gremlin Reliability Management helps teams standardize and automate reliability, one service at a time. In this video, we walk through the platform by showing you how to add your services to Gremlin, integrate your Golden Signals, run reliability tests, and generate reliability scores.

View Video

What is a "service" in a microservices architecture?

Sep 2, 2022 By Andre Newman In Gremlin

The past ten years marked a significant change in how software teams build and deploy applications. We moved away from bulky, slow, monolithic applications toward lightweight, scalable, distributed service-based applications. Meanwhile, tools like Docker, Kubernetes, and other container platforms helped accelerate this process. Despite this sudden growth, a fundamental question remains: what exactly is a service, and how does it fit into a microservices architecture?

Read Post

What are the four Golden Signals?

Sep 2, 2022 By Andre Newman In Gremlin

When it comes to building reliable and scalable software, few organizations have as much authority and expertise as Google. Their book, Site Reliability Engineering, first published in 2016, details the practices they used to maintain reliability as Google scaled. But when you have over a million servers running thousands of services across more than twenty data centers, how do you monitor them in a consistent, logical, and relevant way?

Read Post

Four tests to measure and improve reliability: what matters and how it works

Sep 2, 2022 By Andre Newman In Gremlin

Legendary race car driver Carroll Smith once said, "Until we have established reliability, there is no sense at all in wasting time trying to make the thing go faster." Even though he was referring to cars, the same goes for technology: no amount of code optimization or new features can replace stable systems. Unfortunately, much like race cars, it's hard to know that a system is unreliable until it blows a tire, the brakes stop working, or the steering wheel comes off the column.

Read Post

How to add a Golden Signal to a service in Gremlin RM

Sep 2, 2022 By Gremlin In Gremlin

In this video, we show you how to add a Golden Signal to a service. Gremlin uses your Golden Signals to ensure your services are still healthy and responsive during reliability tests. You can configure Golden Signals to use an existing monitor in your observability tools, such as Datadog, New Relic, or Prometheus. We recommend adding all four Golden Signals to each of your services to ensure comprehensive coverage.

View Video

How to define and measure the reliability of a service

Jul 14, 2022 By Andre Newman In Gremlin

More and more teams are moving away from monolithic applications and towards microservice-based architectures. As part of this transition, development teams are taking more direct ownership over their applications, including their deployment and operation in production. A major challenge these teams face isn't in getting their code into production (we have containers to thank for that), but in making sure their services are reliable.

Read Post
