The latest news and information on observability for complex systems and related technologies.
The year is over, and ‘observability’ has been one of the buzzwords that kept everyone checking in throughout the year, for good reason. Organizations want to leave no stone unturned in maintaining performance and offering robust services, from ‘monitoring’ practices to ‘observability’, ‘telemetry’, and visibility capabilities. So let’s get into the meaning of each term and understand how they are vital for business growth.
A new year is a chance for a fresh start, and a great opportunity to think about the monitoring and observability platform you’re using for your applications. If you’ve been using a legacy monitoring system, you’ve probably heard about observability all over the ‘net and want to figure out whether it is really something you need to care about.
With more than 1.5M room nights booked per day, Booking.com requires a solid infrastructure that’s constantly monitored. And indeed, Booking.com now has a footprint of 50,000+ physical servers running across four data centers and six additional points of presence. The sheer size of this server fleet makes it viable for Booking.com to have dedicated teams that specialize solely in the reliability of those servers.
Today’s systems are more distributed, dynamic, and complex than ever before – plus, users expect more. Also, the historical reliance on an operations team to monitor, triage, and/or resolve issues has become untenable as the number of services has increased. This means that many of the tools that were well suited before might no longer be adequate.
You need not fear a long-lived streaming workload. A few simple tricks can transform a request that may not terminate for hours or days into something you can get regular health and status updates on. We in fact have one of those continuous processing services (Beagle, our Service Level Objective stream processor), which we’ve instrumented in this fashion.
Unlike traditional IT Ops, the role of the SRE isn’t simply focused on finding and solving technical problems. The big win for today’s SREs is supporting the organization’s strategic innovation initiatives. With the appropriate observability capabilities, it’s possible to quantify the value that software infrastructure contributes to this innovation effort.
What’s the first thing most people do when they’re unhappy with a business? Take to social media to complain about it. Observing those comments – otherwise known as “user sentiment observability” – gives you a heads-up when problems become big enough to impact user experience. How can you monitor that voice of the customer? And why is it important to do so? Let’s take a deeper look at the issues.