
Observability

The latest news and information on observability for complex systems and related technologies.

SolarWinds closes the market's hybrid IT observability gap, accelerating transformations for customers

The next generation of SolarWinds Observability delivers innovative and comprehensive full-stack visibility across all IT environments (on-premises, cloud, or hybrid) with flexible self-hosted and SaaS deployment options.

Using Honeycomb for Frontend Observability to Improve Honeycomb

Recently, we announced the launch of Honeycomb for Frontend Observability, our new solution that helps frontend developers move from traditional monitoring to observability. What this means in practice is that frontend developers are no longer limited to a metrics view of their app that can only be disaggregated in a few dimensions. Now, they can enjoy the full power of observability, where their app collects a broad set of data as traces to enable much richer analysis of the state of a web service.

Elevating SolarWinds Observability for Hybrid IT Environments

Exciting times here at SolarWinds. We’re uniting our Self-Hosted and SaaS observability offerings under a single umbrella, SolarWinds Observability, and announcing a host of enhancements that will allow us to go even further to meet our customers' hybrid IT needs. Let’s take a look at what’s in store.

Infrastructure and Observability as Code | An Introduction

In this video I will introduce you to the concept of Observability as Code and what that looks like in Splunk Observability Cloud. I’ll first discuss the issues you might encounter managing infrastructure manually, and then define Infrastructure as Code so that you have a better understanding of the motivation behind Observability as Code. We’ll briefly introduce Terraform and then I’ll discuss the benefits of implementing Observability as Code using Splunk’s Terraform provider in Splunk Observability Cloud.

Unified observability: Maximize visibility & control of multi-cloud environments

In today’s multi-cloud world, gaining real-time visibility across complex infrastructure is vital for business resilience and IT efficiency. However, traditional observability tools often fall short, leaving gaps in data collection and actionable insights. This is where unified observability comes in. Unified observability is Digitate’s unique approach, enabling organizations to monitor and control their business, applications, and infrastructure layers from a single pane of glass.

Getting Started with OpenTelemetry Visualization - A Practical Guide

OpenTelemetry is a Cloud Native Computing Foundation (CNCF) project aimed at standardizing the way we instrument applications for generating telemetry data (logs, metrics, and traces). However, OpenTelemetry does not provide storage and visualization for the collected telemetry data. For OpenTelemetry visualization, you need to use a backend that can ingest the collected data and provide a web UI to visualize it.

Refinery and EMA Sampling

Refinery is Honeycomb’s sampling proxy, which our largest customers use to improve the value they get from their telemetry. It has a variety of interesting samplers to choose from. One category of these is called dynamic sampling. It’s a technique for adjusting sample rates to account for the volume of incoming data while giving rare events more priority than common events. Honeycomb’s query engine can compensate for sampling rates on a per-event basis.
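
To make the idea concrete, here is a minimal Python sketch of dynamic sampling. It is not Refinery's implementation and does not use an EMA: the function names, the `goal_per_key` parameter, and the choice to keep every Nth event per key (rather than sampling randomly) are all assumptions made for illustration. It shows the two halves described above: common keys get a higher sample rate than rare ones, and each kept event carries its sample rate so counts can be re-weighted afterwards.

```python
import math
from collections import Counter


def sample_rates(counts, goal_per_key):
    """Pick a per-key rate (keep 1 in N) so each key keeps about goal_per_key events."""
    return {key: max(1, math.ceil(n / goal_per_key)) for key, n in counts.items()}


def sample(events, rates):
    """Keep every Nth event per key and tag it with the rate it was sampled at."""
    seen = Counter()
    kept = []
    for key in events:
        if seen[key] % rates[key] == 0:
            kept.append({"key": key, "sample_rate": rates[key]})
        seen[key] += 1
    return kept


def estimated_counts(kept):
    """Weight each kept event by its sample rate to estimate the original counts."""
    totals = Counter()
    for event in kept:
        totals[event["key"]] += event["sample_rate"]
    return totals


# Hypothetical traffic: many successful responses, a handful of errors.
events = ["200"] * 1000 + ["500"] * 10
rates = sample_rates(Counter(events), goal_per_key=10)
kept = sample(events, rates)
estimates = estimated_counts(kept)
```

In this made-up traffic, the common key is sampled heavily while every rare error is kept, and weighting by `sample_rate` recovers the original per-key totals.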

Syncing PagerDuty Schedules to Slack Groups

We’ve posted before about how engineers on call at Honeycomb aren’t expected to do project work, and that whenever they’re not dealing with interruptions, they’re free to work on whatever will make the on-call experience better. However, all of our engineering rotations rely on hand-off meetings where they update the Slack groups with everyone who’s on call. During my last shift, a small problem kept causing friction for some of our incident management automation.

Investigate Performance issues with SLOs

When an alert goes off because a Service Level Objective (SLO) is in danger of violation, it comes with a lot of context about what has been going wrong and for how long. Then Honeycomb gives you tools to explore the where & why. Here, Martin Thwaites walks through an example of diagnosing slower performance. What service is the problem, and under what circumstances?