The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

AWS Prometheus: Production Patterns That Help You Scale

You've got Prometheus running in one cluster — maybe a dev environment, a single EKS cluster, or a proof-of-concept setup. The configuration is straightforward: node_exporter on a few EC2 instances, some service discovery for pods, and a single Prometheus server scraping everything. Storage is local, retention is 15 days, and you can keep all the default recording rules without worrying about costs.

Instrumenting the Node.js event loop with eBPF

Recently, I was testing Coroot’s AI Root Cause Analysis on failure scenarios from the OpenTelemetry demo. One of them, loadgeneratorFloodHomepage, simulates a flood of excessive requests. As expected, it caused a latency degradation across the stack. Coroot’s RCA highlighted how the latency cascaded through all dependent services. At the same time, we noticed a moderate increase in CPU usage for the frontend service and the node itself.

Synthetic Monitoring Frequency: Best Practices & Examples

Synthetic monitoring is, at its core, about visibility. It’s the practice of probing your systems from the outside to see what a user would see. But there’s a hidden parameter that determines whether those probes actually deliver value: frequency. How often you run checks is more than a technical configuration—it’s a strategic choice that ripples through detection speed, operational noise, and even your team’s credibility.

SquaredUp Cloud + Dashboard Server

SquaredUp Dashboard Server (DS) and SquaredUp Cloud both deliver cutting-edge data visualization for IT and engineering teams. The two products can be used independently, or together for complete operational visibility. This article explores how SquaredUp DS and Cloud differ, when to use each, and how they work together.

Solve Microsoft Teams Performance Troubles Before They Hit Your Inbox

Who Solves It Faster? Microsoft Native Tools vs. Vantage DX. Tickets piling up. Execs on your case. Teams acting up. Microsoft’s tools only show part of the story, leaving you stuck reacting. Watch our pros do a no-fluff, side-by-side showdown: Microsoft Native Tools vs. Martello Vantage DX. See them tackle real Teams issues and find out who finds and fixes the problem faster. What attendees will learn.

Elastic Cloud Serverless on Google Cloud doubles region availability

We’re pleased to announce the availability of Elastic Cloud Serverless on Google Cloud in three new regions. This doubles the number of available regions on Google Cloud and dramatically increases serverless deployment options in the US. Elastic Cloud Serverless provides the fastest way to start and scale observability, security, and search solutions without managing infrastructure.

Chaos to Choreography: How To Automate IT Operations with Nexthink Flow

Taylor proved it, turning license headaches, VPN chaos, SCCM continuity and patch pain into a smooth, confident performance. Here's how Taylor did it. In the end, IT isn’t just about fixing; it’s about flowing, scaling, and making work effortless. Request a demo today.

How Nexthink Enables Data-Driven Software License Reclamation

This was what Sarah was looking to solve. Managing software licenses isn’t just about tracking installs; it’s about uncovering hidden usage and reclaiming wasted spend. When Sarah faced $12M in software costs, scattered licenses, and zero visibility, she needed a better way. With Nexthink, she gained real-time insights, smart user nudges, and automated reclamation. The result?

OpenTelemetry Logs - A Complete Introduction & Implementation

OpenTelemetry is a Cloud Native Computing Foundation (CNCF) incubating project aimed at standardizing the way we instrument applications to generate telemetry data (logs, metrics, and traces). It offers a vendor-agnostic observability framework: a set of tools, APIs, and SDKs for instrumenting applications.