The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Apache ActiveMQ High Availability Architecture: The Complete 2026 Guide

The most common Apache ActiveMQ high availability mistake is not a configuration error; it is a false assumption. Teams deploy two broker instances, point clients at both with a comma-separated URL, and label the topology "HA." Then the primary crashes, the secondary does not have the message state, and clients start throwing exceptions while the ops team scrambles.

How to Exclude Health Check Endpoints from Python OTel Traces

Health check endpoints generate thousands of identical, useless spans per day. Here are two production-ready approaches to filter them from your Python OTel traces, along with the correctness trap most implementations miss. Prathamesh works as an evangelist at Last9, runs SRE stories, where SRE and DevOps folks share their stories, and maintains o11y.wiki, a glossary of terms related to observability.

last9-genai: Closing the Conversation Gap in LLM Observability

OpenTelemetry's GenAI instrumentation gives you spans and token counts. It does not give you conversations, workflow cost rollups, or prompts visible in your dashboard. last9-genai is an OTel extension that fills those three gaps without replacing your existing observability stack. Prathamesh works as an evangelist at Last9, runs SRE stories, where SRE and DevOps folks share their stories, and maintains o11y.wiki, a glossary of terms related to observability.

State of Observability in Financial Services 2026: From implementation to business impact

The demands on financial services companies are intensifying rapidly. They must not only deliver seamless system performance but also control costs, secure sensitive data, and maximize the value of their observability investments. To navigate these converging pressures, leaders are evolving their approach to system monitoring and telemetry. The 2026 State of Observability in Financial Services research report reveals a fundamental shift in how organizations manage their digital infrastructure.

Digitate is Positioned as a Leader in the IDC MarketScape: Worldwide AIOps 2026 Vendor Assessment

IT operations are in a new era – teams are expected to deliver always-on reliability, absorb constant change, manage runaway telemetry volumes, and still prove business impact. The IDC MarketScape: Worldwide AIOps 2026 Vendor Assessment (doc, March 2026) offers ITOps leaders a valuable lens on the AIOps landscape and the providers shaping what comes next.

Icinga 2 Meets OpenTelemetry: Native Metrics Export in v2.16

The OTLPMetricsWriter is a new Icinga 2 feature, available since v2.16, that exports check plugin performance data as OpenTelemetry-compliant metrics via the OTLP HTTP protocol. With a single configuration object, it connects Icinga 2 to any OTLP-compatible backend, such as Prometheus, Grafana Mimir, Datadog, Elasticsearch, VictoriaMetrics, and more.

Get observability in the terminal, for you and your agents, with the gcx CLI tool

The way you write code is changing, which means the way you observe your systems and respond to issues needs to change, too. Engineers today spend much of their day working via command line, as agentic tools like Cursor and Claude Code have become highly effective at handling many day-to-day engineering tasks. This greatly accelerates code generation, but it doesn't solve for the context switching that comes when you have to jump into another tool that's not part of this new, faster workflow.

Secure performance testing at scale: Introducing secrets management for Grafana Cloud k6

To simulate real user behavior, performance tests often rely on API keys, tokens, or credentials to interact with real systems. But as your testing suite grows, this sensitive data can start to sprawl across scripts, configs, and environments, increasing the risk of exposure and making tests harder to manage and maintain. To address this challenge, we’re rolling out secrets management for Grafana Cloud k6, the fully managed performance testing platform powered by k6 OSS.

Why Runtime Visualization Is the Missing Link in Teaching Real-Time Systems

Guest blog by Florent Goutailler, Associate Professor, Télécom Saint-Etienne, France. Teaching real-time embedded systems has always involved a fundamental challenge: the most critical behaviors – task scheduling, timing, and concurrency – are largely invisible at runtime. When students begin working with a real-time operating system such as FreeRTOS, they are introduced to concepts like scheduling, task prioritization, semaphores, and inter-task communication.

Service-Centric Observability as the Control Layer

If distributed architectures have altered how systems degrade, then the way organizations model operational risk must evolve accordingly. Threshold monitoring evaluates individual metrics. Correlation clusters related alerts. Neither, on its own, explains how instability in one component alters exposure across an interconnected service landscape. In conversations at Nexus Live 2025, ScienceLogic’s annual customer conference, leaders described this distinction with clarity.