How isolation improves queries in Prometheus 2.17

There are instances in life when isolation is actually welcome. One of those instances pertains to the I in the acronym ACID, which outlines the key properties necessary to maintain the integrity of transactions in a database. The time series database (TSDB) embedded in the Prometheus server has the C (consistency), the D (durability), and, somewhat debatably, the A (atomicity). But up to and including Prometheus v2.16, it did not have the I (isolation).

The Lifecycle of a Response

Last year, the Skylight team gave a talk called Inside Rails: The Lifecycle of a Request. In that talk, we covered everything that happens between typing a URL into your browser and the request reaching your Rails controller action. But that talk ended with a cliffhanger: once we are in the controller action, how does Rails send our response back to the browser?

Performing chaos in a serverless world (Gunnar Grosch, Failover Conf 2020)

Chaos engineering is the practice of hypothesis testing through planned experiments to gain a better understanding of a system’s behavior. The principles of chaos engineering have been around for years, and the practice has now gone from being a buzzword used by a few large organizations in very specific fields to being put into use by companies of all sizes and industries.

Swim Don't Sink: Why Training Matters to a Site Reliability Engineering Practice (Jennifer Petoff)

Do you offer training to the engineers in your organization, or do you throw them in at the deep end to “sink or swim”? Providing training and education is universally important for setting team members up for success in your organization, and it is critical for establishing a thriving Site Reliability Engineering (SRE) or DevOps practice and culture in the first place.

Fight, Flight, or Freeze - Releasing Organizational Trauma (Matt Stratton, Failover Conf 2020)

When humans are faced with a traumatic experience, our brains kick in with survival mechanisms. These mechanisms include the familiar fight-or-flight response, but also the freeze response, which occurs when we are terrified or feel that there is no chance of escape.

Y2K and Other Disappointing Disasters: Risk Reduction and Harm Mitigation (Heidi Waterhouse)

Every disaster is a concatenation of smaller failures. How can we design software and processes to accept that we live in an imperfect world? Explore the concepts of resiliency, harm reduction, over-engineering, and planning for failure with real examples.