Analytics

Parsing Log Files With Graylog - Ultimate Guide

Log file parsing is the process of analyzing log file data and breaking it down into logical syntactic components. Put simply, you’re extracting meaningful data from logs that can run to thousands of lines. There are several ways to parse log files: you can write a custom parser or use an existing parsing tool.
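As a rough illustration of the "custom parser" route, here is a minimal Python sketch that splits one log line into named fields with a regular expression. The line layout (timestamp, level, component, message) and the names `LINE_RE` and `parse_line` are assumptions made for this example; they are not taken from Graylog or from any particular log format.

```python
import re
from typing import Optional

# Hypothetical line layout, assumed for illustration only:
#   "<YYYY-MM-DD HH:MM:SS> <LEVEL> <component>: <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<component>[\w.-]+): "
    r"(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Return the named fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = "2019-07-22 14:03:11 ERROR auth: login failed for user=alice"
    print(parse_line(sample))
```

Lines that do not fit the assumed layout return `None` instead of raising, so a caller can count or set aside unparsed lines rather than stopping partway through a large file.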

Managing dynamic data flows across Elasticsearch clusters

Massively scaling free-text search has always been the holy grail in big data. Many software firms now face the burgeoning challenge of searching through previously untapped data sources and the current trend is far surpassing the petabyte scale. Here at LogDNA we manage free-text search for thousands of customers with distinct traffic profiles across a multitude of Elasticsearch clusters.

Log Management and Graylog Alerts - Keeping Track of Events in Real-Time

Every log management solution has its own alerting feature, and alerts are a critical component of any logging tool. They tell you whether an event needs your attention or is just normal everyday activity you can ignore. Graylog’s simplified interface makes the information you need accessible in real time, yet it scales without compromising the level of detail provided.

Meetup Paris #39 : The Five Foundations of Elasticsearch Performance

In this talk, we'll look at five lenses through which one can view the performance of an Elasticsearch cluster. Taking each in turn, attendees will come away with a set of principles and concerns through which they can monitor and understand the health and performance of their production Elasticsearch systems.

Slack Loses $8M to Outages

On July 22, 2019, Slack was in the middle of deploying an update to its desktop app. The update was supposed to decrease memory consumption and improve load time, but instead the company suffered a widespread global outage. After approximately 40 minutes of downtime, the service was back up. But in the meantime, the company whose motto is ‘where work happens’ essentially stopped working.