The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

DNS Outages Expose Hidden Risks. Edwin AI Finds Them Faster.

The recent AWS outage exposed how fragile the internet remains. Amazon traced the hours-long disruption to a DNS error—a small failure with massive reach. For most organizations, DNS operates quietly in the background. When it fails, every digital service connected to it stops. One of LogicMonitor’s valued customers, IG Group, faced a similar event less than ten hours after enabling Edwin AI.

How to Use the Power BI Desktop InfluxDB 3 ODBC Connector

The challenge of storing, processing, and alerting on your time series data is only part of the battle when it comes to deriving value from time-stamped data. While InfluxDB 3 addresses those hurdles with the database and Python processing engine, data analytics teams still need to be able to visualize their data and build dashboards to complete the time series story.

OpenTelemetry Spans Explained: Deconstructing Distributed Tracing

In a microservices architecture, a single user request can pass through multiple services before completing. When performance drops or an error occurs, tracing that journey is the only way to locate the source. Distributed tracing provides that visibility. At its core are OpenTelemetry Spans — units of work that capture what each service does during a request.
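
As a rough illustration of the idea, the sketch below models a span as a timed unit of work that shares a trace ID with its parent. It uses only the Python standard library and is not the OpenTelemetry API: the `Span` and `Tracer` classes, their fields, and the service names are hypothetical and chosen for this example.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One unit of work within a trace (illustrative model only)."""
    name: str
    trace_id: str             # shared by every span in the same request
    span_id: str              # unique to this span
    parent_id: Optional[str]  # None for the root span
    start: float
    end: Optional[float] = None


class Tracer:
    """Hypothetical in-memory tracer that tracks the currently active span."""

    def __init__(self) -> None:
        self.finished = []   # spans in the order they completed
        self._stack = []     # currently open spans, innermost last

    @contextmanager
    def start_span(self, name: str):
        parent = self._stack[-1] if self._stack else None
        span = Span(
            name=name,
            # Child spans inherit the trace ID; a root span creates a new one.
            trace_id=parent.trace_id if parent else uuid.uuid4().hex,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            start=time.monotonic(),
        )
        self._stack.append(span)
        try:
            yield span
        finally:
            span.end = time.monotonic()
            self._stack.pop()
            self.finished.append(span)


# One request that passes through two (made-up) downstream services.
tracer = Tracer()
with tracer.start_span("checkout") as root:
    with tracer.start_span("inventory-service") as inv:
        pass  # work done by the first service
    with tracer.start_span("payment-service") as pay:
        pass  # work done by the second service

for s in tracer.finished:
    print(s.name, "parent:", s.parent_id, "duration:", s.end - s.start)
```

Following the `parent_id` links from any span back to the root reconstructs the path a request took, which is how a trace can point to the service where a slowdown or error occurred.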

Why Your APM Needs Observability - Metrics, Logs, and Traces Explained

Modern software applications are increasingly complex. Microservices, cloud infrastructure, and distributed architectures make it challenging for developers, DevOps engineers, and SREs to maintain high performance and a seamless user experience. Traditional Application Performance Monitoring (APM) provides critical insights into how applications perform, but on its own it often leaves blind spots when diagnosing issues or understanding how the full system behaves.

Meet Olly - The Coralogix AI Observability Agent (Demo)

Olly is Coralogix’s AI-native observability agent that makes observability data fast, accessible, and actionable—for everyone. Traditionally, teams have spent valuable time piecing together dashboards and writing queries to troubleshoot issues. Olly changes that by letting you ask real questions in natural language and delivering instant, intelligent answers from across your logs, metrics, and traces.

AI Agent for Cloud Cost Optimization: From Blind Spots to Smarter Spend

Cloud has become the backbone of digital enterprises, but managing its cost footprint is proving increasingly difficult. With multiple providers, diverse pricing models, and ever-changing workloads, organizations often find themselves facing spend leakage and unanticipated overruns. The stakes are high—not only in terms of IT budgets but also in ensuring cloud resources deliver maximum business value.

Grafana and Grafana Cloud release cycle: An end-of-year update

With the end of the year fast approaching, we want to let you know about some important dates for our upcoming release freezes. Our annual release freeze helps ensure stability for everyone during the holiday season, which is a critical time for many of our customers. This pause helps us protect our on-call teams and maintain a smooth experience for you.

The next evolution of WebPageTest has arrived, and it's a game-changer

Now fully integrated into Catchpoint’s Internet Performance Monitoring (IPM) platform, WebPageTest is no longer just a testing tool; it’s your full-stack performance command center. From AI-powered insights to automation and Smartboards, the new WebPageTest gives digital experience teams everything they need to move beyond page speed and master end-to-end performance. Test smarter, detect faster, and optimize every layer of performance with a unified, AI-powered platform built for experts.