Latest News

Completing the Kubernetes Monitoring Puzzle

Kubernetes has changed the way many organizations approach the deployment of their applications. But despite its benefits, the additional layers of abstraction and reams of data can make Kubernetes monitoring complex. We’ve seen many of these challenges borne out in the results of the 2024 Observability Pulse survey. In the survey report, 36% of respondents say Kubernetes poses a challenge, and just 10% of organizations say they have full observability into their environments.

Data Chaos MUST Be Curbed, but How?

My introduction to the world of data science was writing anomaly detection for a SIEM that catered to banks and credit unions. Some of these places were running on 50-year-old IBM core banking servers — meaning that someone trying to turn off a light in a server room could take down an entire bank with a literal flip of the wrong switch. While some companies take their time updating infrastructure, others still embody the move-fast-and-break-things philosophy of the early dot-com era giants.

Continual Learning in AI: How It Works & Why AI Needs It

Like humans, machines need to continually learn from non-stationary information streams. While this is a natural skill for humans, it’s challenging for AI systems based on neural networks. One inherent problem in artificial neural networks is the phenomenon of catastrophic forgetting, in which training on new data overwrites what the network learned earlier. Deep learning researchers are working extensively to solve this problem in their pursuit of AI agents that can continually learn like humans.

Advantages of an AI-Powered Observability Pipeline

The expenses associated with collecting, storing, indexing, and analyzing data have become a considerable challenge for organizations. This data is growing as fast as 35% a year, multiplying the problems. This surge in data comes with a corresponding rise in infrastructure costs. These costs often force organizations to make decisions about what data they can afford to analyze, which tools they must use, and how and where to store data for long-term retention.

Splunk second thoughts? It's time for the cloud-native alternative

Back in September, when Cisco announced they were acquiring Splunk, we explained how the market was consolidating, with Sumo Logic ahead of the pack and challenging traditional vendors with our cloud-native platform. Now that the deal is complete and Splunk is officially a Cisco company, we’re hearing from more Splunk customers who are considering their options.

What is Log Analytics?

There is observation, and then there is analysis. Log analytics falls under the latter category. Observation and analysis are not mutually exclusive; one builds upon the other. Similarly, log analytics advances beyond simple log monitoring, enabling observability teams to identify trends and irregularities throughout the enterprise. To demystify log analytics, let’s first look at the definition.

Elastic Search 8.13: Simplifying embedding and ranking for developers

Elastic Search 8.13 extends the capabilities that enable developers to use artificial intelligence and machine learning models to create fast and elevated search experiences. The release integrates Apache Lucene 9.10, and measured vector search performance has improved by more than 2x in benchmarks, extending the sophistication of searches that can be performed in near real time.

The Ultimate CPU Alert - Reloaded, Again!

It’s been nearly ten years since “The Ultimate CPU Alert – Reloaded” and its Linux version were shared with the SolarWinds community. At that time, managing CPU data from 11,000 nodes, with updates every five minutes to a central MSSQL database, was a significant challenge. The goal was to develop alerting logic that accurately identifies when a server is experiencing high CPU usage.

Turning Logs into Metrics with OpenTelemetry and BindPlane OP

Turning logs into metrics isn’t a new concept. A version of this functionality is implemented in most agents, visualization tools, and backends. It’s everywhere because converting logs to metrics has many practical applications and is one of the fundamental mechanisms for controlling log volume in a telemetry pipeline. In this post, I’ll give a brief overview of log-based metrics, explain why they matter, and provide examples of how to build them using OpenTelemetry and BindPlane OP.
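
To make the idea concrete, here is a minimal sketch of a log-based metric in plain Python. It does not use OpenTelemetry or BindPlane OP; the log format, the `LINE_RE` pattern, and the `logs_to_counts` helper are assumptions made up for illustration. It only shows the general principle: many log lines collapse into a handful of counters, which is why the technique helps control log volume.

```python
import re
from collections import Counter

# Hypothetical log format, assumed for this sketch:
# "<ISO-8601 timestamp> <LEVEL> <message>"
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def logs_to_counts(lines):
    """Count log lines per (minute, level); lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        # Truncate the timestamp to minute resolution: "YYYY-MM-DDTHH:MM"
        minute = match.group("ts")[:16]
        counts[(minute, match.group("level"))] += 1
    return counts


sample = [
    "2024-04-01T10:00:05Z INFO request served",
    "2024-04-01T10:00:09Z ERROR upstream timeout",
    "2024-04-01T10:00:41Z ERROR upstream timeout",
    "2024-04-01T10:01:02Z INFO request served",
    "not a log line",
]

counts = logs_to_counts(sample)
for (minute, level), n in sorted(counts.items()):
    print(minute, level, n)
```

Here five raw lines reduce to three data points. In a real pipeline the same aggregation would typically run in the collector or agent, so that only the counters, not the raw lines, are shipped to the backend.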