
Latest Posts

Data Collection Strategies for Infrastructure Monitoring - Troubleshooting Specifics

Monitoring and troubleshooting: unfortunately, these terms are still used interchangeably, which can lead to misunderstandings about data collection strategies. In this article we aim to clarify some important definitions, processes, and common data collection strategies for monitoring solutions. We will describe the limitations of these strategies, as well as the key benefits that can potentially also serve troubleshooting needs.

How Netdata's Machine Learning works

Following on from the recent launch of our Anomaly Advisor feature, and in keeping with our approach to machine learning, here is a detailed Python notebook outlining exactly how the machine learning powering the Anomaly Advisor works under the hood. If you’d rather watch, there is also a video walkthrough of the notebook. To try it for yourself, get started by signing in to Netdata and connecting a node.

Anomaly rate in every chart

A month ago, we introduced unsupervised ML & Anomaly Detection in Netdata, the Anomaly Advisor. Today, we’re happy to announce that we’re bringing anomaly rates to every chart in Netdata Cloud. Anomaly information is no longer limited to the Anomalies tab and will be accessible to you from the Overview and Single Node View tabs as well. This will make your troubleshooting journey easier, as you will have the anomaly rates for any metric available with a single click.

Metric Correlations on the Agent

As of v1.35.0, the Netdata Agent can run Metric Correlations (MC) itself. This means that, for nodes with MC enabled, the Metric Correlations feature just got a whole lot faster! The Netdata Metric Correlations feature uses a two-sample Kolmogorov-Smirnov test to find which metrics show a significant distributional change around a highlighted window of interest.
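To make the idea concrete, here is a minimal sketch of a two-sample Kolmogorov-Smirnov statistic in plain Python. It is not Netdata's implementation: the function names, the metric names, and the sample values are all hypothetical, and the sketch computes only the KS statistic D (the largest gap between the two empirical distribution functions), not a p-value. It assumes each metric comes with a "baseline" sample from before the window and a "highlight" sample from inside it.

```python
from bisect import bisect_right


def ks_two_sample_statistic(baseline, highlight):
    """Return D = max |F_baseline(x) - F_highlight(x)| over all observed x.

    F_* are the empirical CDFs of the two samples. Both are step functions
    that only change at sample points, so checking those points is enough.
    """
    if not baseline or not highlight:
        raise ValueError("both samples must be non-empty")
    a = sorted(baseline)
    b = sorted(highlight)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of baseline values <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of highlight values <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


def rank_metrics(samples):
    """Sort metric names by KS statistic, largest distributional change first.

    `samples` maps a (hypothetical) metric name to a (baseline, highlight) pair.
    """
    scores = {
        name: ks_two_sample_statistic(base, high)
        for name, (base, high) in samples.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Made-up values purely for illustration.
    samples = {
        "cpu.user": ([10, 12, 11, 13, 12], [40, 42, 45, 41, 43]),
        "disk.io": ([5, 6, 5, 7, 6], [5, 6, 6, 7, 5]),
    }
    for name in rank_metrics(samples):
        base, high = samples[name]
        print(name, ks_two_sample_statistic(base, high))
```

A metric whose highlighted window looks nothing like its baseline scores close to 1, while a metric that behaves the same in both windows scores close to 0, which is why ranking by this statistic surfaces the metrics that changed most.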

Monitoring Ubuntu 20.04 and Activating ML with Netdata

Sometimes a hat is just a hat, the truth is just the truth, and the clearly most popular example of a category is plain to see. In this case, Ubuntu is the most popular Linux distribution currently available. With that popularity also comes an amazing amount of community support.

Test Driving Machine Learning (ML) Anomaly Advisor

Netdata’s new Anomaly Advisor feature lets you quickly identify potentially anomalous metrics during a particular timeline of interest. This considerably speeds up your troubleshooting workflow and saves valuable time when you are trying to find the root cause of an outage or issue.

Introducing Anomaly Advisor - Unsupervised Anomaly Detection in Netdata

Today we are excited to launch one of our flagship ML-assisted troubleshooting features in Netdata: the Anomaly Advisor. The Anomaly Advisor builds on earlier work to introduce unsupervised anomaly detection capabilities into the Netdata Agent from v1.32.0 onwards.

Kubernetes Throttling Doesn't Have To Suck. Let Us Help!

In the Kubernetes (K8s) community, there is a huge misconception about CPU allocation and utilization. Even highly experienced SREs find themselves struggling with the way Kubernetes allocates CPU resources, leading to misconfigured CPU allocations and extremely negative outcomes. For starters, this results in significant quality degradation on important service components, introduced by behind-the-scenes CPU limiting (or throttling).

Troubleshooting Alerts the Right Way: As a Team

At Netdata, we love two things more than anything else: Our goal is to make troubleshooting and monitoring as seamless as possible with the open-source Agent. This includes giving you pre-configured alerts so that you get notified immediately when a disruption occurs. The Netdata Agent comes with over 250 pre-configured and optimized alerts.