
The Link Between Early Detection and Internet Resilience: A Lesson from Salesforce's Outage

Almost every study examining the hourly cost of outages reaches the same conclusion: outages are expensive. A 2016 study estimated the average cost of downtime at approximately $9,000 per minute. In a more recent study, 61% of respondents stated that outages cost them at least $100,000, with 32% indicating costs of at least $500,000 and 21% reporting expenses of at least $1 million per hour of downtime.

The Single Pane of Glass in Modern Observability

Recently I caught up with Jamie Allen on Episode 67 of the Slight Reliability podcast to discuss the idea of a single pane of glass (SPOG). Jamie had written an article titled "The Single Pain of Glass", which coincidentally was also the title I gave Slight Reliability Episode 10. Given our shared use of puns and interest in this topic, I thought it was worth a conversation! So, what is a single pane of glass? Is it an idea with practical application? How does it fit into the world of modern observability?

Charmed Kubeflow 1.8 Beta is here

Have you heard the news? Charmed Kubeflow 1.8 is available in Beta. Kubeflow is the foundation of Canonical MLOps. The latest release brings improved capabilities to personalise different components of the platform, including the images that can be used in Notebooks. We are looking for data scientists, machine learning engineers, creators and AI enthusiasts to take Charmed Kubeflow 1.8 Beta for a test drive and share their feedback with us.

Harmonizing Digital Channels and Business Operations to Deliver a Good Customer Experience

In celebration of Customer Experience Day 2023, this post is part of a series on customer experience and the ways that Splunk strives to deliver superior customer experience at every level. Today, customers interact with brands through a variety of channels and platforms. In fact, 57% of customers prefer to engage with brands through digital channels first.

Simplifying Microsoft Teams Troubleshooting for IT Teams

Microsoft Teams has become the go-to platform for seamless collaboration and communication. However, like any technology, performance issues can arise, and these issues affect user experience and productivity. For IT teams tasked with Microsoft Teams troubleshooting, having access to comprehensive data is key. In this blog, we explore the challenges faced by IT teams and how harnessing more data can make the process significantly easier.

How We Did It: Data Ingest and Compression Gains in InfluxDB 3.0

A few weeks ago, we published some benchmarking that showed performance gains in InfluxDB 3.0 that are orders of magnitude better than previous versions of InfluxDB and, by extension, other databases as well. Two key factors drive these gains: data ingest and data compression. This raises the question: just how did we achieve such drastic improvements in our core database? This post sets out to explain how we accomplished these improvements for anyone interested.

Top 10 Tools to Monitor Core Web Vitals of Your Website

What guarantees the success of a website today isn’t just its content and design; delivering a seamless and efficient user experience (UX) is also critical. This is where Core Web Vitals come in: they provide a collection of performance metrics for evaluating the quality of a website's user experience. Core Web Vitals are critical to attracting and retaining visitors, as they directly impact a site’s visibility on Google.

Configuration Drift: Understanding, Avoiding, Managing and Resolving in Kubernetes

If you work with Kubernetes, you know that any number of issues can pose a serious threat to the stability and security of your deployments. One that's subtly damaging is configuration drift, which occurs when the actual configuration of your system strays from the configuration you defined. Configuration drift in Kubernetes can happen when people make changes manually, systems aren't synchronized properly, or monitoring falls short.

What is a SharePoint Site Collection?

SharePoint, born from the tech giant Microsoft, is not just another application; it’s a robust platform that’s been transforming the way businesses handle their internal processes for years. At its core, SharePoint is designed to streamline collaboration and document management. But what does that mean in layman’s terms? Imagine a vast digital library, where instead of books, you have documents, images, videos, and other digital content.

Announcing HAProxy Enterprise 2.8 & HAProxy ALOHA 15.5

HAProxy Enterprise 2.8 and HAProxy ALOHA 15.5 are now available. Users of our enterprise-class software load balancer and hardware/virtual load balancer appliance who upgrade to the latest versions will benefit from all the features announced in the community version, HAProxy 2.8, plus some features that enhance the flexibility of our enterprise security options to meet even more use cases.