Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Turns any command into a plugin: check_rungrep

Imagine you have one more special thing to monitor. While our Icinga 2 can observe infrastructure of almost any size, it still needs a plugin for each kind of check. Unfortunately, not every command conforms to the monitoring plugin API: exit codes 0-3 (OK, warning, critical, unknown), performance data, and so on. For example, programs often exit with 1 on a fatal error, which Icinga treats as a mere warning.
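To make the exit-code mismatch concrete, here is a minimal Python sketch of a wrapper that runs an arbitrary command and translates its result into the plugin convention described above. It is not how check_rungrep itself works; the names `run_check` and `map_exit_code`, the rule "any non-zero exit becomes critical", and the `exit_code` performance data label are all assumptions chosen for illustration.

```python
import subprocess

# Plugin states as described above: 0-3 (OK, warning, critical, unknown).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def map_exit_code(returncode: int) -> int:
    """Assumed policy: 0 stays OK, every other exit code becomes CRITICAL."""
    return OK if returncode == 0 else CRITICAL


def run_check(command: list[str], timeout: float = 10.0) -> tuple[int, str]:
    """Run `command` and return (plugin state, one-line plugin output)."""
    if not command:
        return UNKNOWN, "UNKNOWN - no command given"
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except OSError as exc:
        # Covers a missing or non-executable command.
        return UNKNOWN, f"UNKNOWN - cannot run command: {exc}"
    except subprocess.TimeoutExpired:
        return UNKNOWN, f"UNKNOWN - timed out after {timeout}s"

    state = map_exit_code(proc.returncode)
    label = "OK" if state == OK else "CRITICAL"
    # Use the first line of output (stdout, else stderr) as the status text.
    lines = (proc.stdout or proc.stderr).strip().splitlines()
    detail = lines[0] if lines else "no output"
    # Text after "|" is performance data in label=value form.
    return state, f"{label} - {detail} | exit_code={proc.returncode}"


def main(argv: list[str]) -> int:
    """Print the plugin output and return the state to use as exit code."""
    state, message = run_check(argv)
    print(message)
    return state
```

Used as a script, the wrapper would end with `sys.exit(main(sys.argv[1:]))`, so that a command exiting with 1 is reported to the monitoring system as critical (2) rather than as a warning.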
Sponsored Post

Using observability tools for security monitoring and incident detection

Most security teams overlook a goldmine of data sitting right in their applications - crash reports and Real User Monitoring (RUM) telemetry. While engineers typically use these tools for performance tracking, they can reveal security incidents that might otherwise go unnoticed. Let's explore some practical ways to turn your observability data into a powerful security monitoring system.

Is your #observability always one step behind?

Guess what: It is designed to be like that! And the only way for you to get ahead of your operational challenges is to think differently. With Netdata, you get high-fidelity, ultra-detailed insights with unmatched granularity and cardinality and instant root cause analysis. See your infrastructure like never before! Get X-Ray Vision for your infrastructure!

Top 5 outages detected by StatusGator in February 2025

Service disruptions can happen at any time, affecting communication, productivity, and access to critical platforms. In February, several major services experienced outages, causing frustration for users worldwide. With its Early Warning Signals feature, StatusGator detected these issues in real time—often before official acknowledgments—helping users stay informed and prepared. Here are five notable outages from the past month.

Unlocking the Value of Network Observability

Today, a strong network forms the backbone of business success, making network visibility crucial. As modern networks continue their rapid evolution, it's essential to have an observability solution that is robust, resilient, and scalable. Teams need a solution that helps them enhance network performance and improve user experiences. They need a solution that enables them to confidently face current and future network operations challenges. Network Observability by Broadcom is that solution.

Why you shouldn't run tests sequentially

A problem that comes up frequently in support conversations and in posts on Playwright forums is a little hard to describe, but it comes down to synchronous testing: developers write a series of Playwright tests on the assumption that one of the tests will always run first or last and will serve as a setup or cleanup script.
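The ordering assumption described above can be avoided by attaching setup and cleanup to every test instead of to a "first" or "last" test. The sketch below illustrates the idea with Python's standard `unittest` module as a stand-in for a Playwright suite; it does not use Playwright, and `FakeBackend`, `ProfileTests`, and `run_suite` are hypothetical names invented for this example.

```python
import unittest


class FakeBackend:
    """Hypothetical stand-in for the state a browser test would touch."""

    def __init__(self) -> None:
        self.users: set[str] = set()

    def create_user(self, name: str) -> None:
        self.users.add(name)

    def delete_user(self, name: str) -> None:
        self.users.discard(name)


BACKEND = FakeBackend()


class ProfileTests(unittest.TestCase):
    def setUp(self) -> None:
        # Runs before every test, so no test depends on another running first.
        BACKEND.create_user("alice")
        # Registered cleanup runs after every test, even when the test fails.
        self.addCleanup(BACKEND.delete_user, "alice")

    def test_user_exists(self) -> None:
        self.assertIn("alice", BACKEND.users)

    def test_user_can_be_removed(self) -> None:
        BACKEND.delete_user("alice")
        self.assertNotIn("alice", BACKEND.users)


def run_suite() -> unittest.TestResult:
    """Load and run the tests; the result does not depend on test order."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProfileTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because each test creates and removes its own data, the tests can run in any order, or in parallel workers with separate state, without one of them acting as a setup or cleanup script.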

DEM 101: Understanding and implementing digital experience monitoring

A faulty engine in a high-performance car: how disappointing is that? The same goes for a slow-loading, poorly performing webpage. All such a page earns is tired, irritated customers and a loss of trust in the brand. Modern businesses need a fast, reliable, and seamless digital experience. Proactive monitoring of the user experience, which means understanding how users interact with all digital touchpoints, is vital.