Operations | Monitoring | ITSM | DevOps | Cloud

Alert fatigue, part 1: avoidance and course correction

Alert fatigue occurs when someone is exposed to a large number of frequent alarms (alerts) and consequently becomes desensitized to them. The problem is not specific to technology fields: most professions that require on-call duty, such as medicine, experience it in slightly different forms, but the underlying problem is the same.

Icinga 2 DSL Feature: Namespaces coming in v2.10

Under the hood, Icinga 2 uses many constants and reserved keywords, e.g. “Critical” or “Zone”, which are respected by the config parser and compiler. This sometimes leads to errors when users accidentally override them or re-define their own global constants. v2.10 introduces namespaces to address this and ensures that such accidents won’t happen anymore.

The strange case of SNMP: A weird story about protocols

A crime had taken place in Manhattan Grow, a well-established company, so to speak, of invoices. Someone shouted “OH MY GOD, WE MUST FIND THE GUILTY ONE!” There wasn’t enough coffee and doughnuts in the machine to calm the entire office. In the middle of summer, the sky was slowly walking towards dusk, slowly fading away. No one was going to leave Manhattan Grow until the mess was fixed. Heads were going to roll.

Kubernetes monitoring with Prometheus - Prometheus operator tutorial (part 3)

We covered how to install a complete ‘Kubernetes monitoring with Prometheus’ stack in the previous chapters of this guide. But using the Prometheus Operator framework and its Custom Resource Definitions has significant advantages over manually adding metric targets and service providers, which can become cumbersome for large deployments and doesn’t fully utilize Kubernetes’ orchestrator capabilities.

How IT Pros Can Maximize Efficiency With Uptime.com

IT professionals have to efficiently manage the several dozen to several hundred critical pieces of infrastructure that a modern business needs to stay afloat. Even smaller businesses often encounter this challenge. We understand that at every level, the time spent researching infrastructure issues comes at a cost. That’s why we’ve built some time-saving measures into Uptime.com to help you make more efficient use of your most precious resource: your time.

Monitoring Kafka in Production

Franz Kafka was a German-speaking Bohemian Jewish novelist and short story writer, widely regarded as one of the major figures of 20th-century literature. Apache Kafka, on the other hand, is an open-source stream-processing software platform. Due to its widespread integration into enterprise-level infrastructures, monitoring Kafka performance at scale has become an increasingly important issue.