Blog

Why Enterprises should push for Container Adoption in 2020?

Enterprises that have been running their own data centers for a number of years are skeptical of the benefits associated with the cloud. One consideration for many enterprises is being able to build modern applications so that the dependency on a particular cloud stack is minimal, or so that the interfaces that depend on a specific cloud are well abstracted.

Implementing Geolocation with Graylog Pipelines

Geolocation can be automatically built into the Graylog platform by using the "GeoIP Resolver" plugin with a MaxMind database. However, you can further improve your ability to extract meaningful and useful data by leveraging the functionality of pipelines and lookup tables. In fact, these powerful features allow you to do much more than the basic plugin.

How to create an on-call schedule that doesn't suck.

A lot of tech companies struggle to create an effective and efficient on-call schedule for their products and services, which results in much longer downtimes when something goes wrong. They often overburden their team members with repeated on-call duty, which leads to fatigue. Here’s how to create an on-call schedule that your team might love.

Detecting CVE-2020-0601 Exploitation Attempts With Wire & Log Data

Editor’s note: CVE-2020-0601, unsurprisingly, has created a great deal of interest and concern. There is so much going on that we could not adequately provide a full accounting in a single blog post! This post focuses on detection of the vulnerability based on network logs, specifically Zeek as well as Endpoint. If you are collecting vulnerability scan data and need to keep an eye on your inventory of systems that are at risk, then check out Anthony Perez’s blog.

The Future of Cortex: Into the Next Decade

The Cortex project, a horizontally scalable Prometheus implementation and CNCF project, is more than three years old and shows no sign of slowing down. Right now, there are a lot of things going on in Cortex, but sometimes it’s not clear why we’re doing them. So I want to provide some clarity for both the Cortex community – and the wider Prometheus community – regarding our intentions, especially with regard to the Thanos project.

LogicMonitor and Unomaly can pre-empt business problems with AIOps

Curious about AIOps these days? You’re not alone. AIOps (Artificial Intelligence for IT Operations) is all about analyzing and automating your IT operations using artificial intelligence and machine learning algorithms. These operations include end-to-end workflows that bring monitoring, analytics, incident management, and automation systems together with a common goal of optimizing and automating operational tasks.

Observability vs Monitoring

Observability is a hot subject right now, stirring a great deal of debate among IT admins. This report brings some clarity and sheds some light on the question: “What is the difference between monitoring and observability?” Enterprise IT is complex, as IT infrastructure solutions are delivered from enormous datacenters at remote locations.

Automating Sentry Releases with CircleCI

Continuous integration tools like CircleCI let developers automate builds and tests, so that teams can merge changes into their codebase quickly and frequently. In this article, we’ll take a look at how to combine Sentry’s command line interface with CircleCI to automatically create Sentry releases. This will unlock some of our best features, like identifying suspect commits that likely introduced new errors, applying source maps to see the original source code within Sentry, and more.

Introducing Netdata's step-by-step tutorial

Health monitoring and performance troubleshooting aren’t easy. That’s exactly why we’re building Netdata, to democratize monitoring and make it accessible to anyone interested in learning more about their systems and applications. Of course, teaching a complicated topic isn’t easy either. Until recently, the only resource to help new users after installation has been our getting started guide.

What Are Service-Level Objectives? Lessons Learned

Service Level Objectives, or SLOs, are internal goals for the essential metrics of a service, such as uptime or response speed. We’re probably familiar with this definition, but what is the value of setting these goals? We’ll take a look at SLOs as both a powerful safety net and a tool to inform the allocation of engineering resources, while also considering the cultural learnings of SLO adoption.