Chaos Engineering

Gremlin ALFI Demo - AWS RDS Unavailable - Chaos Engineering

In this demo, we'll share how you can use ALFI (Application Level Failure Injection) to make AWS RDS unavailable. This enables you to learn how your application handles different failure modes. We'll be using the ALFI Latency attack to perform this Chaos Engineering experiment.
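As a rough illustration of the idea, and not Gremlin's ALFI API, application-level latency injection can be sketched in plain Python as a wrapper around a database call. Every name here (`with_injected_latency`, `DependencyTimeout`, `query_orders`, `load_orders_with_fallback`) is hypothetical, and the sketch assumes the caller treats any call slower than its timeout budget as an unavailable dependency.

```python
import time
from typing import Any, Callable, List


class DependencyTimeout(Exception):
    """Raised when the simulated call exceeds the caller's timeout budget."""


def with_injected_latency(
    call: Callable[..., Any],
    latency_s: float,
    timeout_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[..., Any]:
    """Wrap `call` so it behaves as if the dependency were `latency_s` slower.

    If the injected latency meets or exceeds the timeout budget, the wrapper
    waits out the timeout and raises, which is how the application would
    experience an unavailable database.
    """

    def wrapped(*args: Any, **kwargs: Any) -> Any:
        if latency_s >= timeout_s:
            sleep(timeout_s)
            raise DependencyTimeout(
                f"no response within {timeout_s}s (injected {latency_s}s)"
            )
        sleep(latency_s)
        return call(*args, **kwargs)

    return wrapped


def query_orders() -> List[str]:
    """Hypothetical stand-in for a real database query."""
    return ["order-1"]


def load_orders_with_fallback(query: Callable[[], List[str]]) -> List[str]:
    """Application code under test: degrade to an empty list on timeout."""
    try:
        return query()
    except DependencyTimeout:
        return []
```

Wrapping `query_orders` with a latency above the timeout and passing it to `load_orders_with_fallback` shows whether the application degrades gracefully or fails outright, which is the question the experiment is meant to answer.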

Announcing the Gremlin Chaos Engineering Practitioner Certificate Program

Get started with Gremlin's Chaos Engineering tools to safely, securely, and simply inject failure into your systems to find weaknesses before they cause customer-facing issues. Chaos Engineering continues to grow in popularity and is rapidly becoming a job requirement. To help Engineering and Testing teams meet that need, we’re launching our first-ever Gremlin Chaos Engineering Practitioner Certificate Program!

Datadog on Chaos Engineering

As you scale your applications, staying resilient to underlying network failures, resource constraints introduced by other applications, or spikes in traffic can become exponentially more complex, even with very thorough testing and processes. Chaos engineering is a discipline that encourages experimenting in production by injecting controlled failures into a system, both to understand how it reacts under those conditions and to improve its reliability.

Podcast: Break Things on Purpose | Jose Nino, Staff Software Engineer at Lyft

Break Things on Purpose is a podcast for all things Chaos Engineering. Check out our latest episode below. You can subscribe to Break Things on Purpose wherever you get your podcasts. If you have feedback about the show, find us on Twitter at @BTOPpod or shoot us a note at podcast@gremlin.com!

Failover Conf follow-up: Your team and culture questions answered!

Thank you all for joining us last week for Failover Conf 2! We had a great turnout this year, with over 1,800 participants, 20 sponsors, and 9 amazing sessions. After more than a year of virtual events and video calls, we know that Zoom fatigue is real. We tried to make this event different by finding new ways to bring the community together and thinking of fun new ways to shake up the conference formula.

Fireside Chat with Jesse Robbins and Kolton Andrus Failover Conf 2021

Long before Chaos Engineering was even a phrase, Jesse Robbins was Amazon.com's "Master of Disaster," using intentional failure to help the company become more reliable. Kolton Andrus (CEO at Gremlin) sits down with Jesse to learn more about his early work with GameDays, the evolution of reliability, and where the future of SRE lies.

Fireside Chat with Inés Sombra and Ana Medina Failover Conf 2021

Reliability is a requirement for the modern internet. Ana Medina joins Inés Sombra, Sr. Director of Engineering at Fastly, to discuss their approach to resilience, how the past year has influenced the way they work, and what practices your engineering organization can adopt to become more reliable.