Monitoring

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Log Management Guide: Why You Should Track Logs?

IT experts agree that log management and monitoring are among the most effective ways to keep IT infrastructure performing optimally. Logs play a vital role in improving performance, enhancing security, and detecting issues. At the same time, many people don’t use logs to their full potential. This guide will not only introduce you to log management but also reveal which logs to track and what information they give you.

Windows Event Log Best Practices for Operations Teams

The Windows Event log is an essential tool for administrators to investigate and diagnose potential system issues, but gaining real value from it and separating useful log entries from noisy, non-essential activity can be a daunting task. Depending on the level of logging configured, Windows events can span system issues, application-specific issues, and security issues such as unauthorized access, login failures, and unusual behavior.

Development Environment Observability with Sentry

At Sentry, we’re always looking for innovative ways to dogfood our product. Over the last year we added Sentry’s error monitoring to our developer environment so that we could better understand its health. In this blog post I’m going to touch on how fragile local development environments can be, how we brought observability into what’s happening by introducing Sentry, and what outcomes it has driven for our engineering organization.

Challenges maintaining Prometheus LTS

In this article, we’ll cover the three main challenges you may face when maintaining your own Prometheus LTS solution. In the beginning, Prometheus stated that it wasn’t meant to be long-term metrics storage; the expectation was that somebody would eventually create long-term storage (LTS) for Prometheus metrics. Currently, several open-source projects provide long-term storage for Prometheus (Prometheus LTS). Three community projects are ahead of the rest: Cortex, Thanos, and M3.

Introducing Logz.io Event Management: Accelerating Collaborative Threat Response

In the domain of cyber threat response, there’s a critical resource that every organization is desperately seeking to maximize: time. It’s not like today’s DevOps teams aren’t already ruthlessly focused on optimizing their work to unlock the greater potential of their human talent. Enabling your organization to identify and address production issues faster – and increase focus on innovation – is the primary reason why Logz.io and its observability platform exist.

How to Measure Packet Loss | Obkio

Packet loss is one of the core network metrics that you should be measuring when monitoring your network performance. The most accurate way to measure packet loss is by using network monitoring software, like Obkio. Measurement frequency is essential because packet loss is expressed as a percentage: for that percentage to be accurate, you need to monitor a continuous volume of traffic. Easily see the percentage of packet loss anywhere in your network with updates every 500ms.
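
To illustrate why volume matters for a percentage-based metric, here is a minimal Python sketch. It is not Obkio's implementation; the function names `packet_loss_percent` and `loss_resolution_percent` are hypothetical, and the probe counts are made-up inputs chosen only for the arithmetic.

```python
def packet_loss_percent(sent: int, received: int) -> float:
    """Return packet loss as a percentage of packets sent.

    Hypothetical helper: assumes we already know how many probe
    packets were sent and how many came back.
    """
    if sent <= 0:
        raise ValueError("sent must be a positive number of packets")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return 100.0 * (sent - received) / sent


def loss_resolution_percent(sent: int) -> float:
    """Smallest non-zero loss percentage observable with `sent` packets.

    Losing a single packet out of `sent` is the finest step the
    percentage can take, so fewer packets means a coarser measurement.
    """
    if sent <= 0:
        raise ValueError("sent must be a positive number of packets")
    return 100.0 / sent


if __name__ == "__main__":
    # Illustrative numbers only: 200 probes sent, 197 answered.
    print(packet_loss_percent(200, 197))
    # With only 10 probes, one lost packet already reads as a large loss.
    print(loss_resolution_percent(10))
    print(loss_resolution_percent(1000))
```

With few probes, a single dropped packet swings the percentage sharply, which is one reason continuous, high-volume measurement gives a more trustworthy figure.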

Dashboard Fridays: Sample Symantec Endpoint Protection (SEP) Dashboard

Join SquaredUp's Adam Kinniburgh and SCOM community hero Ruben Zimmermann as they showcase this example SEP Dashboard. Giving an overview of the status of the various endpoint protection systems, this dashboard is used by the IT team to keep on top of device security, and by the service desk to escalate appropriately.

Video: The new simple, scalable deployment for Grafana Loki and Grafana Enterprise Logs

With the recent release of Loki 2.4 and Grafana Enterprise Logs 1.2, we’re excited to introduce a new deployment architecture. Previously, if you wanted to scale a Loki installation, your options were: 1) run multiple instances of a single binary (not recommended!), or 2) run Loki as microservices. The first option was easy, but it led to brittle environments where a heavy query load could take down data ingestion and problems were often difficult to debug.

Simple, scalable deployment for Grafana Loki and Grafana Enterprise Logs

Loki 2.4 and GEL 1.2 introduced a hybrid deployment model that takes the simplicity of running the Loki log aggregation solution as a single binary and introduces an easy path to high availability and scalability. Particularly for organizations running on virtual machine and bare metal (non-Kubernetes) environments, this is a game-changer! Learn more in this tutorial from Grafana Labs Senior Software Engineer Trevor Whitney.

Streamline Issue Management and Communication at Scale: Power Home Remodeling and Sentry

When it comes to managing multiple applications and services, driving alignment and communication across teams can be like herding cats. Too many channels, projects, and cross-functional stakeholders can cause friction that slows down issue management and affects the overall product experience.