
StackState

Part 2: Monitoring - Level 1

The first level of the Observability Maturity Model, Monitoring, is not new to IT. But as reliable IT system operation becomes more and more critical, the importance of monitoring continues to increase. A monitor tracks a specific parameter of an individual component in the system to make sure it stays within an acceptable range; if the value moves out of the range, the monitor triggers an action, such as an alert, state change or warning.
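As a minimal sketch of the monitor described above, the following Python tracks one parameter of one component against an acceptable range and triggers an action when a value falls outside it. The class name `RangeMonitor`, the component `web-01`, the parameter `cpu_percent`, and the 0 to 90 range are hypothetical, chosen only for illustration; this is not StackState's implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RangeMonitor:
    """Hypothetical monitor: watches one parameter of one component."""
    component: str
    parameter: str
    low: float    # lower bound of the acceptable range (inclusive)
    high: float   # upper bound of the acceptable range (inclusive)
    on_breach: Callable[[str], None]  # action to trigger, e.g. send an alert

    def check(self, value: float) -> bool:
        """Return True if value is in range; otherwise trigger the action."""
        if self.low <= value <= self.high:
            return True
        self.on_breach(
            f"{self.component}.{self.parameter}={value} "
            f"outside [{self.low}, {self.high}]"
        )
        return False


# Illustrative usage: collect alerts in a list instead of paging someone.
alerts: List[str] = []
cpu_monitor = RangeMonitor("web-01", "cpu_percent", 0.0, 90.0, alerts.append)
cpu_monitor.check(42.0)   # within range, no action
cpu_monitor.check(97.5)   # out of range, appends one alert message
```

Here the action is simply appending a message to a list; in practice it could equally be a state change or a warning, as the paragraph above notes.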

Using Observability with Kubernetes to Automate Site Reliability Engineering

In this video, Anthony Evans, solution architect, explains how the StackState topology-powered observability platform can help SREs automate site reliability, putting their organizations on the path to becoming a zero-downtime enterprise. See how StackState helps to unify and correlate data across your stack, visualize your entire IT environment, instantly pinpoint root cause, reduce alert storms and, with AIOps capabilities, even prevent problems proactively. It's all here!

Changes are Observability's Biggest Blind Spot

Classically, observability lives in layers of information on a dashboard. It uses the fundamental trio of data (metrics, logs and traces) from each layer of the environment to assess the health of an IT infrastructure. However, a time component is also critical, because it makes the stack observable at any point in time. Gathering reliable data and insights into your IT infrastructure remains the primary role of observability tools and services.

Real World Insights - My Take on the Observability Maturity Model

A prelude to our upcoming six-part Observability Maturity Model Fundamentals blog series, by Lodewijk Bogaards. At StackState, we have spent eight years in the monitoring and observability spaces. During this time, we have spoken with countless DevOps engineers, architects, SREs, heads of IT operations and CTOs, and we have heard the same struggles over and over.

Anomaly Detection and AIOps - Your On-Call Assistant for Intelligent Alerting and Root Cause Analysis

In this blog, we examine how anomaly detection helps by setting up healthy alerts and providing efficient root cause analysis. Anomaly detection, part of AIOps, guides your attention to the places and times where remarkable things occurred. It reduces information overload, thereby speeding up RCA investigation.

Site Reliability Engineering, Site Reliability Engineers and SRE Practices: State of Adoption

Site reliability engineering (SRE) is what you get when you treat operations as if it’s a software problem. The mission of an SRE practice is to protect, provide for and progress the software and systems offered and managed by an organization with an ever-watchful eye on their availability, latency, performance and capacity.

AIOps: Hype vs. Reality

What is AIOps? How does an AIOps platform help your observability practice? AIOps platforms analyze telemetry and events, and identify meaningful patterns that provide insights to support proactive responses. AIOps platforms have five characteristics. The above is Gartner’s definition and is part of the Gartner® “Market Guide for AIOps Platforms.” The Gartner definition is also aligned with our view.

StackPod: Jujhar Singh of Thoughtworks on Why Technology Is Always About People

A few episodes ago, we talked with fellow podcaster and tech evangelist Dotan Horovits. During that episode, Dotan shared that he wrote a blog post with Jujhar Singh called “How Much Observability Is Enough?” which is definitely a recommended read if you’re implementing observability and feeling overwhelmed. After reading this article, we were eager to invite Jujhar to the StackPod as well, to dive into this topic a bit more.

Research Report: Observability at the Speed of Innovation 2022

IT innovation is happening at a record pace. With today’s complexities, you need deep insights into your IT environment—more than traditional monitoring tools can provide. Enter modern observability, a critical application. Observability moves beyond monitoring to help teams understand what is actually happening in the system by bringing together and correlating information from all layers of your IT stack. Observability gives teams deeper, more actionable insights into both the state of a system and the reasons for its behavior.