The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Lumigo Launches AI Agent Observability

LLM-powered agents are reshaping software, but when they fail, troubleshooting is guesswork. Lumigo’s new AI Agent Observability, now in beta, gives you visibility into the entire lifecycle of your agents, from prompt to response to internal decision logic. Built for modern AI workloads, this feature is designed to help engineers monitor, debug, and optimize agents running on platforms like OpenAI, Anthropic, and open-source models.

Honeycomb Users Are Living in the Future, Part 1: Sampling

When we talk to new Honeycomb users, a few things stand out as sounding downright magical. Sometimes we’ll hear, “Wow, is that a new feature?” and we’ll say that no, it’s been like that for years. Clearly we need to get the word out! This is the first installment of a blog series I’ll be writing, covering areas of Honeycomb that elicit reactions of awe and disbelief from new users.

Observability's Moneyball Moment: How AI Is Changing the Game (Not Ending It)

We're not witnessing the end of observability; we're witnessing its evolution into something far more powerful. The observability industry is having its Moneyball moment. Just as Billy Beane revolutionized baseball by using data analytics to compete with teams that had vastly larger budgets, observability is undergoing a fundamental transformation.

Integration Spotlight: Observo AI Supercharges SOCs on Elastic

Elastic is a go-to choice for organizations that want a powerful, flexible search and analytics engine without the cost overhead of traditional SIEM platforms. With its open-source foundation and customizable architecture, the Elastic (ELK) Stack—Elasticsearch, Logstash, and Kibana—has become a cornerstone for many modern observability and security workflows.

Not all monitoring sees what your users are seeing.

APM tools are great, but they have blind spots: they do not monitor from where your users actually are. There’s a gap between lab-perfect APM tests and real-world experience, and a lot can degrade performance between your cloud environment and your users. If you’re not monitoring that path, you’re missing critical context.

Kubernetes Monitoring 101: 25 Tools And Must-Know Tips

The Kubernetes platform is the standard for orchestrating containerized applications. It’s ideal for large applications running on distributed instances. However, monitoring Kubernetes infrastructure can be notoriously challenging. This guide will cover Kubernetes monitoring in more detail, including what metrics to track to improve visibility and control over your K8s containers, apps, microservices, etc.

OpenTelemetry Collector: A Complete Guide [2025]

The OpenTelemetry Collector is a stand-alone service that acts as a powerful, vendor-neutral pipeline for your telemetry data. It can receive, process, and export logs, metrics, and traces, giving you full control over your observability data before it reaches a backend. This guide will provide a comprehensive overview of the OpenTelemetry Collector, its architecture, deployment patterns, and how to configure it for production use.

Notes from the Field: Seamless SSO 404s Impacting Citrix on Windows Server 2025

As a Citrix consultant, not every issue I troubleshoot is directly tied to Citrix, but many of them dramatically impact the end-user experience. This is one of those cases. A customer had begun testing Windows Server 2025 as Multi-Session hosts in their environment. The new servers were domain-joined and fully patched, and they expected a smooth experience with Office 365, Entra ID–backed apps, and cloud-based authentication. Everything had worked flawlessly on Server 2022.

Bringing Intelligence and Automation Together to Change the Shape of Work

The aspirational target state for a cognitive system is to “take responsibility” for a domain (e.g., an autonomous car). To reach that level of sophistication, the system must achieve high levels of maturity simultaneously along two dimensions: reasoning ability and automation ability.