Operations | Monitoring | ITSM | DevOps | Cloud

Latest News

A practical guide to Logstash

Logstash is a tool to collect, process, and forward events and log messages and this Logstash tutorial will get you started quickly. It was created by Jordan Sissel who, with a background in operations and system administration, found himself constantly managing huge volumes of log data that really needed a centralized system to aggregate and manage them. Logstash was born under this premise and in 2013 Sissel teamed up with Elasticsearch.

Tutorial: Elasticsearch Snapshot Lifecycle Management (SLM)

Let’s face it, nothing is perfect. The better we architect our systems, though, the more near-perfect they become. But even so, someday, something is likely to go wrong, despite our best effort. Part of preparing for the unexpected is regularly backing up our data to help us recover from eventual failures and this tutorial explains how to use the Elasticsearch Snapshot feature to automatically backup important data.

CI/CD Detection Engineering: Splunk's Security Content, Part 1

It's been a while since I've had the opportunity to take a break, come up for air, and write a blog for some of the amazing work the Splunk Threat Research team has done. We have kept busy by shipping new detections under security-content (via Splunk ES Content Update and our API). Also, we have improved the Attack Range project to allow us to test detections described as test unit files.

Gardener, SAP's Kubernetes-as-a-service open source project, is moving its logging stack to Loki

Kristian Zhelyazkov is a developer at SAP working on Gardener, the SAP-driven Kubernetes-as-a-service open source project. In this guest blog post, he explains why the project is moving its logging stack to Loki.

Prometheus vs Nagios

Production environment stability and high availability are the holy grail of every SaaS company. R&D organizations put a lot of effort into achieving these goals by implementing different monitoring and alert methodologies and by utilizing a variety of systems and tools. Mean-time-to-detect (MTTD) and mean-time-to-repair (MTTR) are two crucial KPIs that help R&D management personnel determine the efficiency and proficiency of their teams’ responses to production incidents.

Managing Docker Logs with ELK and Fluentd

This article provides an overview of managing and analyzing Docker logs and explores some of the complexities that may arise when looking through the log data. We will go through the default logging approach, as well as look at some more advanced configurations that will make diagnosing issues in your Docker-hosted applications much easier going forward.

Server Monitoring and Alerts - Getting Past Common Obstacles

Keeping a server running optimally on a consistent basis involves managing multiple system elements simultaneously. Automated scripts and specialized software can handle the tasks your server needs to complete on a daily basis—but when one of these experiences an error, it can throw the entire system off.

Logging for DevSecOps

Logging is probably not the first item to come to mind when most of us think about DevSecOps, a term that refers to the integration of security into DevOps processes, but it should be. Logging and log management play a critical role in helping to put DevSecOps principles into practice by ensuring that developers, IT operations staff, and security teams have the visibility and communication pipelines they need to prioritize security at all stages of the DevOps delivery cycle.