The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Build Your Kubernetes Monitoring Foundation with kube-prometheus-stack

When you run Kubernetes at scale, one of the first challenges is understanding what the cluster is actually doing. Workloads shift around, pods restart for normal reasons, and traffic doesn't always follow the patterns you expect. Having clear signals makes day-to-day operations much easier. That's where kube-prometheus-stack helps. It brings Prometheus, Grafana, Alertmanager, and supporting components together as a single package.

How OpenTelemetry can enhance observability in distributed systems: Practical examples

Observability has become one of the fundamental elements of performance and reliability as modern applications move toward cloud-native architectures, microservices, and multi-cloud. Traditional monitoring techniques often fall short in such dynamic, distributed environments. That’s where OpenTelemetry (OTel), an open-source observability framework, comes into the picture.

OpManager streamlined IT for Detroit Wayne Integrated Health Network

When Detroit Wayne Integrated Health Network needed reliability at every heartbeat, they turned to ManageEngine OpManager. From chaos to clarity, OpManager unified their IT, reduced downtime, and powered faster, smarter care delivery. Discover how you can do the same.

Cisco & Auvik: Total Visibility and Control for Your Network with Auvik

Managing modern networks is complicated, and it’s easy for critical Cisco gear to quietly hit End-of-Sale (EOS) or Last Date of Support (LDOS) without anyone noticing. That can open the door to serious risks, technical debt, and compliance issues. Manual tracking and scattered tools just can’t keep up anymore. Watch this video to see how to stay ahead. Save money and reduce headaches: lower costs and tackle technical debt with smarter lifecycle management for your Cisco hardware.

Pastries with SREs: No compromises on cost-effective observability or donuts.

In this episode of Pastries and SREs, we dig into how vendor lock-in and sky-high observability costs are forcing teams to choose between coverage and budget, and why you shouldn’t have to settle. With donuts in hand, we explore how to take back control of your observability strategy by making it cost-effective, comprehensive, and flexible.

Performance testing best practices: How to prepare for peak demand with Grafana Cloud k6

For many organizations, periods of high customer activity are anything but relaxing. Events like Black Friday, product launches, or major sales can put intense strain on the software and infrastructure systems that support a company’s web applications. Without proactive performance testing, these moments can quickly turn into poor user experiences and lost revenue.

How to Manage Grafana Access Groups for Team Control

Managing team access in Grafana can be tricky—especially as your organization grows. That’s where Grafana access groups (also known as Limited Access Groups in Hosted Graphite) come in. They allow you to define groups of dashboards and restrict which team members can access them. If you’re using Hosted Graphite with Grafana dashboards, this feature helps you organize teams, maintain data privacy, and simplify access control—all while giving users just the permissions they need.
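
The idea of grouping dashboards and limiting who can open them can be pictured with a small sketch. This is only an illustrative model, not the Hosted Graphite or Grafana API: the `AccessGroup` class, the `visible_dashboards` helper, and all dashboard and user names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AccessGroup:
    """Hypothetical model: a named set of dashboards plus the members allowed to see them."""
    name: str
    dashboards: set[str] = field(default_factory=set)
    members: set[str] = field(default_factory=set)


def visible_dashboards(user: str, groups: list[AccessGroup]) -> set[str]:
    """Return every dashboard the user can reach through any group they belong to."""
    visible: set[str] = set()
    for group in groups:
        if user in group.members:
            visible |= group.dashboards
    return visible


# Example data (all names are made up for illustration).
groups = [
    AccessGroup("platform", {"cluster-overview", "node-health"}, {"ana", "raj"}),
    AccessGroup("billing", {"invoice-latency"}, {"raj"}),
]
```

Under this model, a user who belongs to no group sees nothing, and a user in several groups sees the union of those groups' dashboards, which matches the goal of giving users just the permissions they need.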