Are you testing for known reliability vulnerabilities?
"Risks have different priorities, but ultimately we want to be aware of those risks.
Just like we want our security team to go scan for known vulnerabilities, our reliability team should be scanning for known vulnerabilities. And those are easy things we should go address.
There's a second part of it, which is kind of just good engineering testing, which is: Hey, we have a set of test cases that we know need to pass.
What happens if we lose a dependency? What happens if we lose an availability zone? What happens if we shift over a region? Those are important test cases. Are we testing them? Do we have tests that cover them? Are we running them on a regular basis?"
—Kolton Andrus, Gremlin CTO
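
As a rough illustration of the first test case in the quote ("What happens if we lose a dependency?"), here is a minimal Python sketch. Everything in it is hypothetical and not taken from Gremlin or from the quote: the `DependencyUnavailable` exception, the two stand-in clients, the `get_recommendations` function, and the fallback list are assumed names chosen for the example. It simulates the failure with an in-process stub rather than injecting a real fault into a running system, so it only shows the shape of such a test.

```python
import unittest


class DependencyUnavailable(Exception):
    """Hypothetical error raised when a downstream dependency cannot be reached."""


class HealthyRecommendationClient:
    """Stand-in for a dependency that is up and answering normally."""

    def fetch(self, user_id):
        return ["item-1", "item-2"]


class FailingRecommendationClient:
    """Stand-in for a lost dependency: every call fails."""

    def fetch(self, user_id):
        raise DependencyUnavailable("recommendation service unreachable")


# Assumed fallback content served when personalization is unavailable.
DEFAULT_RECOMMENDATIONS = ["popular-1", "popular-2"]


def get_recommendations(client, user_id):
    """Return personalized items, or a static default if the dependency is lost."""
    try:
        return client.fetch(user_id)
    except DependencyUnavailable:
        return DEFAULT_RECOMMENDATIONS


class LostDependencyTest(unittest.TestCase):
    def test_healthy_dependency_returns_personalized_items(self):
        result = get_recommendations(HealthyRecommendationClient(), "user-1")
        self.assertEqual(result, ["item-1", "item-2"])

    def test_lost_dependency_falls_back_to_defaults(self):
        result = get_recommendations(FailingRecommendationClient(), "user-1")
        self.assertEqual(result, DEFAULT_RECOMMENDATIONS)
```

Saved as a test module, this can be run with `python -m unittest`, which makes it easy to schedule in CI and so covers the "are we running them on a regular basis?" part of the question. Availability zone and region failover cannot be meaningfully stubbed this way; those cases need fault injection against real infrastructure.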