Operations | Monitoring | ITSM | DevOps | Cloud

Logging

The latest News and Information on Log Management, Log Analytics and related technologies.

Elastic Enterprise Search 8.2: Relevance controls for Elasticsearch

Elastic Enterprise Search 8.2 introduces new ways to ingest, search, and monitor data, giving developers the productivity benefits of using out-of-the-box capabilities along with the power and flexibility inherent in Elastic Stack tools. Operators also gain even more transparency for managing search experiences and observing search performance. For a visual walkthrough of some of the key capabilities in 8.2, check out the latest installment of What’s new in Enterprise Search on YouTube.

How to Monitor Riak Metrics with OpenTelemetry

observIQ’s OpenTelemetry members contributed Riak metric monitoring support to OpenTelemetry! You can now monitor your Riak agent performance with OpenTelemetry, and deploy simply with the oIQ OpenTelemetry Collector. You can add the Riak metric receiver to any OpenTelemetry collector. This post demonstrates a configuration for shipping metrics to Google Cloud Operations with OpenTelemetry components.

Proactive Monitoring vs. Reactive Monitoring

Monitoring is a fundamental pillar of modern software development. With the advent of modern software architectures like microservices, the demand for high-performance monitoring and alerting shifted from useful to mandatory. Combine this with an average outage cost of $5,600 per minute, and you’ve got a compelling case for investing in your monitoring capability.

How to Monitor Redis with OpenTelemetry

We’re excited to announce that we’ve recently contributed Redis monitoring support to the OpenTelemetry collector. You can check it out here! You can utilize this receiver in conjunction with any OTel collector: including the contrib collector, the observIQ’s distribution of the collector, as well as Google’s Ops Agent, as a few examples.

Get More Value From Your Logs Without Compromising Costs

Everyone at LogDNA is (unsurprisingly) obsessed with the power of log data. It is the single source of truth for what is happening in your environment and, when used correctly, provides the insights needed to deliver better experiences. Now more than ever, people across various teams understand the value of having easy access to log data within key workflows.

3 Key Benefits to Web Log Analysis

Whether it’s Apache, Nginx, ILS, or anything else, web servers are at the core of online services, and web log analysis can reveal a treasure trove of information. These logs may be hidden away in many files on disk, split by HTTP status code, timestamp, or agent, among other possibilities. Web access logs are typically analyzed to troubleshoot operational issues, but there is so much more insight that you can draw from this data, from SEO to user experience.

High-Performance Javascript in Stream - Why the Function in Your Filter Matters

Being a Cribl Pack author, I frequently receive questions related to why I chose to implement a certain functionality inside my Packs the way I did. A few lives ago, I worked for a Fortune 250 oil & gas company where I managed our SIEM environment. We didn’t have much in terms of system resources, so we needed to make everything run as efficiently as possible. (Maybe that’s where I get my love for performance from?)

Apache Kafka Consumer Lag Monitoring

The world lives by processing the data. Humans process the data – each sound we hear, each picture we see – everything is data for our brain. The same goes for modern applications and algorithms – the data is the fuel that allows them to function and provide useful features. Even though such thinking is not new, what is new in recent years is the requirement of near-real-time processing of large quantities of events processed by our systems.

Kubernetes Incident Response Best Practices

Inevitably, organizations that use technology (regardless of the extent) will have something, somewhere, go wrong. The key to a successful organization is to have the tools and processes in place to handle these incidents and get systems restored in a repeatable and reliable way in as little time as possible.