
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

How does AI enhance search?

Explore how artificial intelligence enhances search engines through semantic understanding, vector embeddings, and contextual retrieval. Learn how AI-powered search delivers faster and more accurate results. About Elastic: Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale. Elastic’s solutions for search, observability, and security are built on the Elastic Search AI Platform, the development platform used by thousands of companies, including more than 50% of the Fortune 500.

The Spark Avengers Unite: Dispatches on the FUTURE of IT (w/ Matt, Moe & Denis)

Tom assembles the “Spark Avengers” for a deep dive into the most talked-about innovation in IT: Nexthink Spark, the personal AI agent for every employee. Joined by Moe Haidar, Denis Schertenleib and Matt Rose, the team unpacks how Spark evolved from early LLM experiments into an enterprise-ready, autonomous IT agent already delivering 70%+ first contact resolution. From printers and frozen cameras to complex root-cause analysis, Spark is transforming support from reactive to proactive.

Skills vs. MCP: You're probably reaching for the wrong one

Everyone is adding Model Context Protocol (MCP) servers to everything right now. And I get it. MCP is clean. It’s standardized. You write a server, expose some tools, and suddenly your LLM can query your log platform, pull a dashboard, and fire an alert. It feels like the right abstraction. But I’ve watched teams at serious companies burn weeks building MCP integrations for workflows that should have been skills, and build skills for things that genuinely needed MCP.

7 Real Ways to Modernize NetOps with Kentik AI Advisor

Kentik’s AI Advisor acts as a virtual network engineer, helping teams of all skill levels troubleshoot, manage, and optimize their infrastructure with unprecedented speed and context. We explore seven practical NetOps use cases, from rapid incident triage and capacity planning to upcoming live-device command support, that demonstrate how using AI as a collaborative teammate dramatically reduces manual investigative work.

Generating metrics from traces with cardinality control: A closer look at HyperLogLog in Tempo

While tracing is a critical component of any observability strategy, metrics — especially RED metrics (request rate, error rate, and duration) — are widely considered the gold standard for monitoring service health. Tempo, the open source, easy-to-use, and highly scalable distributed tracing backend, is well known in the OSS community for storing and querying traces. It can also, however, generate RED metrics directly from those traces using the optional metrics-generator component.
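To make the idea of deriving RED metrics from traces concrete, here is a minimal Python sketch. It is not Tempo's metrics-generator implementation: the `Span` record, the `red_metrics` function, and the sample data are hypothetical names chosen for illustration, and the sketch assumes all spans fall within a single fixed time window.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical, simplified span record (not Tempo's data model).
    service: str
    duration_ms: float
    is_error: bool


def red_metrics(spans, window_seconds):
    """Aggregate per-service RED metrics from spans seen in one time window."""
    grouped = defaultdict(list)
    for span in spans:
        grouped[span.service].append(span)

    metrics = {}
    for service, items in grouped.items():
        count = len(items)
        errors = sum(1 for s in items if s.is_error)
        metrics[service] = {
            "request_rate": count / window_seconds,  # requests per second
            "error_rate": errors / count,            # fraction of failed requests
            "avg_duration_ms": sum(s.duration_ms for s in items) / count,
        }
    return metrics


# Illustrative sample data only.
spans = [
    Span("checkout", 120.0, False),
    Span("checkout", 80.0, True),
    Span("search", 30.0, False),
    Span("search", 50.0, False),
]
result = red_metrics(spans, window_seconds=2)
```

Each additional label used for grouping (endpoint, status code, customer ID) multiplies the number of output series, which is why cardinality control matters once metrics are generated from traces at scale.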

Use plain English to query your multi-cloud infrastructure in Resource Catalog

Modern cloud environments include thousands of resources across providers, teams, and accounts. Organizations need the ability to quickly locate the right resources so that they can manage resource compliance and troubleshoot issues. When engineers need to answer questions such as which databases are still on extended support or which storage buckets lack encryption, they often have to switch consoles, use provider-specific query languages, and know obscure version strings or configuration flags.

Public Sector Observability: Service Experience and Reliability Are Now Mission-Critical

Reliable digital services aren’t optional for public sector agencies. They’re essential to mission success. Across the U.S. public sector, service experience and reliability have moved from operational concerns to mission requirements. At a federal level, Executive Order 14058 makes improving service delivery and customer experience a federal priority, measured by real outcomes for the public. And for state and local governments, the bar is set by the private sector.