
Latest News

Accelerate incident investigations with Log Anomaly Detection

Modern DevOps teams that run dynamic, ephemeral environments (e.g., serverless) often struggle to keep up with the ever-increasing volume of logs, which makes it even harder for engineers to troubleshoot incidents effectively. During an incident, the trial-and-error process of finding and confirming which logs are relevant to your investigation can be time-consuming and laborious. This results in employee frustration, degraded performance for customers, and lost revenue.

Lights, Camera, Action: Introducing The Fellowship of the Stream

Last week, an article from SiliconAngle came out detailing the challenges facing cybersecurity professionals. Companies are in desperate need of solutions to deal with cloud-native applications that exist in fast-paced environments. The security and IT teams monitoring these applications need scalable and flexible solutions that drive actionable insights. That’s why we built Cribl Stream.

Tackling Your Carbon Footprint with the Sustainability Toolkit for Splunk

Simple questions can be overwhelming, and not knowing the answer after a mouse click is no longer an option. Sustainability is top of mind for organizations across all verticals, and Splunk can help with the power of data. Our upcoming Sustainability Toolkit, based on the Splunk platform, equips organizations with capabilities to gain deep insights into their carbon footprint and empowers them to take the necessary actions toward their carbon neutrality goals.

Use Service Design in Operations Management to Enhance Security

As an IT operations manager, you spend a lot of your time mitigating service outages and service level risks. You've worked diligently to get the right people, products, processes, and partners in place to meet your goals. You've managed to ensure continued uptime. You've reduced the number of tickets and the cost per ticket. And for your efforts, you're rewarded with managing your company's cybersecurity program. The problem? You're not a security specialist.

How to Use OpenTelemetry to Troubleshoot a Serverless Environment with StackState

Losing track of communication between applications or code has become a problem as the tech world moves toward serverless cloud architectures, which leave the developer to maintain, upgrade, and update these services. One might say that services and code are becoming more loosely coupled, allowing code to run and execute in silos. Let's take an AWS Lambda function as an example.

Slack's New Metrics Storage Engine Challenges Prometheus

Metrics storage engines must be specially engineered to accommodate the quirks of metrics time-series data. Prometheus is probably the most popular metrics storage engine today, powering numerous services including our own Logz.io Infrastructure Monitoring. But Prometheus was not enough for Slack given their web-scale operation. They set out to design a new storage engine that can yield 10x more write throughput, and 3x more read throughput than Prometheus! In February 2022 Suman Karumuri, Sr.

Elasticsearch Release: Roundup of Change in Version 8.1.0

Elastic released a major version of its platform on February 10, 2022. Version 8.0.0 is the latest major version. A minor release, version 8.1.0, has already followed, and further minor and patch releases are expected as Elastic rolls out new features and fixes. Version 8.0.0 is the first significant revision since April 2019, when version 7.0.0 became generally available. Users can find a complete list of release notes on the Elastic website.

The Bird is the Word: Getting Up and Running Fast on Humio, by CrowdStrike

I’ve been in the log data analytics space for years, and I have loved seeing the technology and methodologies change and evolve. One of my favorite changes has been the emergence of index-less solutions, and Humio has a great solution here. If you haven’t heard of Humio, you should check out their index-less log management solution for yourself (free up to 16 GB/day too).

New in Grafana Loki 2.5: Faster queries, more log sources, so long S3 rate limits, and more!

I’m very excited to tell you all about the latest Grafana Loki installment, 2.5! A huge amount of work, nearly 500 PRs, has gone into Loki between v2.4 and now. The major themes for this release are improved performance, continuing ease of operations, and more ways to ingest your logs. I usually find myself the most excited about performance improvements, so let’s start there.

How to Identify Memory Leaks

As a software developer, you may not be used to thinking about the memory usage of your applications. Memory is plentiful and usually relatively fast in today's development world. The programming language you're using likely doesn't require you to allocate or free memory on your own. However, this does not mean you are safe from memory leaks. Memory leaks can occur in any application written in any language. Sure, older or "close to the metal" languages like C or C++ have more of them.