The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

2026 Observability & AI Outlook for IT Leaders

IT operations have outgrown the model they were built on. Enterprises now monitor tens of thousands of metrics, ingest terabytes of logs, and generate thousands of alerts daily, all while managing increasingly complex infrastructures that span on-prem data centers, multiple cloud environments, and emerging AI workloads. Yet despite all this telemetry, too many teams still learn about outages from customers before they see them in their tools.

OpenTelemetry Collector Contrib - A Hands-on Guide

As application systems grow more complex, it becomes ever more important to understand how services interact across distributed systems. Observability sheds light on the behavior of instrumented applications and the infrastructure they run on. This enables engineering teams to better track system health and prevent critical failures. OpenTelemetry (OTel) has standardized how we generate and transmit telemetry, and the OpenTelemetry Collector is the engine that processes and exports this data.

Check out features we announced at AWS re:Invent in the latest episode of This Month in Datadog

Tune in for spotlights on Bits AI SRE, now generally available, and Datadog’s MCP Server, which connects AI agents to our platform by ingesting prompts and mapping them to Datadog resources and data. Plus, we cover how to: search logs at petabyte scale in your own infrastructure with CloudPrem; break down cost drivers at the prefix level with Storage Management; create workflows that adapt to real-world complexity with Agent Builder; and detect and block credential leaks with Secret Scanning.

How to Monitor Network Performance for Call Centers (Remote & On-Site)

A customer calls to place an urgent order. Your agent's VoIP line cuts out mid-sentence. Is it their home connection? Your network? The ISP? The phone system? You have no visibility, and by the time you figure it out, the customer's gone. This is the reality for modern call centers, whether your agents work from a central office, from home, or split between both. Network issues don't just slow operations; they destroy customer experiences in real time.

Your Opsgenie Migration is the Path to Proactive Reliability

With the Opsgenie end-of-life deadline (April 5, 2027) fast approaching, you're facing a critical choice: Do you truly need to move your dedicated Incident Response workflow into the complexity of Jira Service Management (JSM) or Compass? If your current process is a reactive treadmill—plagued by alert fatigue, lost context, and constant non-critical paging—the mandated move risks replacing one chaotic toolset with another complex ITSM solution. View this not as a burden, but as a chance to build a standardized, human-centric workflow that solves your biggest pain points and transforms your response from chaos to control.

From Zero Tickets to High-ROI: AI + DEX in 2026 (w/ Samuele Gantner and Vedant Sampath)

Kicking off 2026, Tim and Tom welcome Nexthink Chief Product Officer Samuele Gantner and first-time guest CTO Vedant Sampath for a candid “three pillars” deep-dive on enterprise AI. They explore how AI is reshaping product and engineering: new tooling, new development cycles, and the shift from deterministic software to probabilistic agents—plus the critical role of evals, benchmarks, guardrails, and performance. Then they unpack Nexthink’s three-pillar framework.

What is OTLP and How It Works Behind the Scenes

If you have worked with observability tools in the last decade, you have likely managed, and been burnt by, a fragmented collection of tools and libraries. Each observability signal required its own tool, and the data formats were incompatible, with little or no correlation between signals. For example, log records would not link to traces, meaning you had to guess which traces led to which events. The OpenTelemetry Protocol (OTLP) solves this by decoupling how telemetry is generated from where it is analyzed.

Website Monitoring: What, Why, and Best Practices

When digital presence dictates business success, understanding website monitoring is no longer optional. Whether you run an e-commerce store, SaaS platform, or enterprise website, it’s a fundamental pillar of modern operations. Even a few minutes of website downtime can result in lost revenue, damaged credibility, and frustrated users.

2026 observability trends and predictions from Grafana Labs: unified, intelligent, and open

After a decade of dashboards, alerts, and ever-expanding telemetry pipelines, observability is changing. It is no longer just the domain of engineering: the most innovative organizations are extending observability to all areas of the business to better understand system behavior, emerging risks, and customer impact. At the same time, rising cloud costs and increasing complexity are forcing organizations to be more intentional about what they observe and why.