
The latest News and Information on Log Management, Log Analytics and related technologies.

Creating Re-Usable Components for Telemetry Pipelines

One challenge to the widespread adoption of telemetry pipelines by SRE teams within an organization is knowing where to start when building one. Faced with a wide assortment of sources, processors, and destinations, teams can find that setting up a telemetry pipeline feels like trying to build a Lego set without any instructions. The solution is to give teams pre-defined components, each with a specific function, that they can use to build pipelines that meet their own requirements.
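As a rough illustration of what pre-defined, re-usable components could look like, here is a minimal Python sketch. It assumes events are plain dictionaries and that a processor is any function that takes a stream of events and returns a stream of events. The names `drop_debug`, `redact_field`, and `build_pipeline` are hypothetical and do not come from any particular telemetry pipeline product.

```python
from typing import Callable, Iterable, Iterator

Event = dict
Processor = Callable[[Iterator[Event]], Iterator[Event]]


def drop_debug(events: Iterator[Event]) -> Iterator[Event]:
    """Hypothetical re-usable component: filter out debug-level events."""
    return (e for e in events if e.get("level") != "debug")


def redact_field(field: str) -> Processor:
    """Hypothetical component factory: mask one field on every event."""
    def processor(events: Iterator[Event]) -> Iterator[Event]:
        for e in events:
            if field in e:
                # Copy the event so the caller's original dict is not mutated.
                e = {**e, field: "***"}
            yield e
    return processor


def build_pipeline(*processors: Processor) -> Callable[[Iterable[Event]], Iterator[Event]]:
    """Chain components in order; each one consumes the previous one's output."""
    def run(events: Iterable[Event]) -> Iterator[Event]:
        stream = iter(events)
        for p in processors:
            stream = p(stream)
        return stream
    return run


# A team assembles its own pipeline from the shared components.
pipeline = build_pipeline(drop_debug, redact_field("email"))
result = list(pipeline([
    {"level": "debug", "msg": "cache miss"},
    {"level": "error", "msg": "login failed", "email": "a@example.com"},
]))
```

Because every component shares the same stream-in, stream-out shape, teams can reorder or swap components without rewriting them.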

Creating In-Stream Alerts for Telemetry Data

Alerts that you receive from your observability tool are based on conditions that existed seconds to minutes in the past, because the alert is only triggered after the data has been indexed within the tool. This significantly limits your ability to take timely action, and often your window of opportunity to react has passed by the time you receive the alert.
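To make the contrast concrete, the following Python sketch evaluates an alert condition while events are still flowing through the pipeline, instead of after they have been indexed. It is only an illustration under stated assumptions: events are dictionaries with a numeric `ts` (seconds) and a `level` field, and the function name `in_stream_alerts`, the threshold, and the window size are all hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator


def in_stream_alerts(events: Iterable[dict], threshold: int = 3,
                     window_seconds: int = 60) -> Iterator[dict]:
    """Pass every event through unchanged, and emit an alert record as soon
    as `threshold` error events fall within `window_seconds` of each other."""
    recent = deque()  # timestamps of recent error events
    for event in events:
        if event.get("level") == "error":
            ts = event["ts"]
            recent.append(ts)
            # Discard error timestamps that have aged out of the window.
            while recent and ts - recent[0] > window_seconds:
                recent.popleft()
            if len(recent) >= threshold:
                yield {"type": "alert", "ts": ts, "error_count": len(recent)}
                recent.clear()  # start counting again after alerting
        yield event  # the event continues to its destination regardless


stream = [
    {"ts": 0, "level": "error"},
    {"ts": 5, "level": "info"},
    {"ts": 10, "level": "error"},
    {"ts": 20, "level": "error"},
    {"ts": 200, "level": "error"},
]
output = list(in_stream_alerts(stream))
alerts = [e for e in output if e.get("type") == "alert"]
```

In this sketch the alert is produced at the moment the third error passes through, before any downstream tool has stored or indexed the data.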

The Layers, Not Pillars, of Observability

Remember the Tabs vs. Spaces arguments? It seems that observability has grown up enough that we are arguing over which signals are the “best” signals for observability. Often referred to as the Pillars of Observability, Metrics, Logs, and Traces (sometimes with Events added, for MELT) each provide a unique perspective on a system. What happens when we change our perspective from finding the “best” telemetry format to finding the telemetry that aligns with the problems we need to solve?

A Next-Gen Partnership with CrowdStrike's Falcon Next-Gen SIEM

In an increasingly digital world, organizations face complex challenges in managing security data that is growing at a relentless pace. With the rapid growth of cyber assets and the ever-present threat of sophisticated attacks, legacy security tools often struggle to keep up.

What's Chaos Monkey? Its Role in Modern Testing

Chaos Monkey is an open-source tool. Its primary use is to check system reliability against random instance failures. Chaos Monkey follows the testing concept of chaos engineering, which prepares networked systems for resilience against random and unpredictable chaotic conditions. Let’s take a deeper look.
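The idea of testing against random instance failures can be sketched with a small simulation. The Python below is not Chaos Monkey's actual API or configuration; it is a toy model that assumes a fleet is a dictionary of instance names to states, and the helper names `terminate_random_instance` and `service_is_healthy` are hypothetical.

```python
import random
from typing import Optional


def terminate_random_instance(instances: dict, rng: random.Random) -> Optional[str]:
    """Pick one running instance at random and mark it terminated.
    Returns the name of the terminated instance, or None if none are running."""
    running = [name for name, state in instances.items() if state == "running"]
    if not running:
        return None
    victim = rng.choice(running)
    instances[victim] = "terminated"
    return victim


def service_is_healthy(instances: dict, min_running: int = 2) -> bool:
    """Assumed health rule: the service needs at least `min_running` instances."""
    return sum(state == "running" for state in instances.values()) >= min_running


rng = random.Random(42)  # seeded so the experiment is repeatable
fleet = {"web-1": "running", "web-2": "running", "web-3": "running"}

victim = terminate_random_instance(fleet, rng)
healthy = service_is_healthy(fleet)
```

The point of such an experiment is the check after the failure: if the service stays healthy with one instance gone, the system tolerated the loss; if not, the test has exposed a weakness before a real outage did.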

Put Your Issue Detection and Response on Fast-Forward With GenAI

Most engineers will tell you this: Troubleshooting today feels like trying to find your way out of a wild jungle, in the middle of a storm, at night, while a countdown clock is running. In other words, it’s ambiguous, nerve-racking, and plain difficult. But should this be the norm?