The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Observability in under 5 seconds: Reflecting on a year of grafana/otel-lgtm

With grafana/otel-lgtm, observability is just one Docker command away. Over the past year, grafana/otel-lgtm has simplified observability setups, helping developers get a complete OpenTelemetry stack running in under five seconds. With integrations for metrics, logs, traces, and now profiles via Grafana Pyroscope, it has become a go-to solution for demos, development, and testing, as evidenced by its community (1k stars on GitHub and growing!) and notable adopters.

How a Fortune 500 Company Eliminated 93% of IT Incidents in 72 Hours

Sometimes the biggest transformations begin with what sounds like the worst possible news. One day, this Fortune 500 technology company’s observability platform was running smoothly. The next, they learned their critical monitoring solution would be discontinued as part of a corporate buyout. For a leading global IT vendor in data infrastructure serving customers across storage, cloud, and managed services, this was a potential catastrophe.

An open-source SDK for finding dead code

Writing code is easier than ever. We want to make deleting code just as easy – introducing Reaper for iOS and Android. Reaper was an Emerge Tools product that helped companies like Duolingo delete 1% of their iOS codebase. And just like with Emerge Tools’ Launch Booster, we’re making Reaper open-source for anyone to use. In this post, we’ll explain what Reaper is, why you should care about dead code, and how Reaper works on both platforms.

See System Logs Alongside your Metrics Using Loki, Grafana, and Graphite

In this quick demo, we show how you can transform logs collected by Grafana Loki into actionable Graphite metrics using MetricFire. Watch as we convert structured logs into performance insights. Perfect for teams looking to bridge the gap between logging and monitoring. This workflow helps you move beyond basic log storage and turn raw logs into meaningful metrics for alerts, dashboards, and capacity planning.

How Replicas Work in Kubernetes

Replicas in Kubernetes control how many copies of your pods run simultaneously. They're the foundation of scaling, availability, and recovery in your cluster. Whether you're running a stateless API or a background worker, understanding how replicas work directly impacts your application's reliability and performance. This blog walks through replica management, from basic concepts to production monitoring patterns that help you maintain healthy, scalable applications.
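
As a rough illustration of the idea that a replica count is a desired state the cluster keeps converging toward, here is a minimal Python sketch. It is not Kubernetes code: the `reconcile` function, the pod names, and the `api-` prefix are all hypothetical and exist only to show the "compare desired with running, then add or remove" loop.

```python
import itertools


def reconcile(desired, pods, name_source):
    """Return a pod list adjusted to hold exactly `desired` entries.

    Hypothetical helper: `pods` is a list of pod names currently running,
    `name_source` is an iterator that yields fresh pod names.
    """
    if desired < 0:
        raise ValueError("desired replica count must be non-negative")
    pods = list(pods)  # work on a copy so the caller's list is untouched
    # Too few pods (for example after a crash): start replacements.
    while len(pods) < desired:
        pods.append(next(name_source))
    # Too many pods (for example after scaling down): drop the extras.
    del pods[desired:]
    return pods


# Illustrative usage with made-up pod names.
names = (f"api-{i}" for i in itertools.count())
running = reconcile(3, [], names)       # initial rollout
running.remove("api-1")                 # simulate a pod failure
running = reconcile(3, running, names)  # a replacement is created
scaled_down = reconcile(1, running, names)
```

The same function handles scale-up, recovery, and scale-down, which is the point of declaring a replica count instead of starting pods by hand.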

Improve Consistency Across Signals with OTel Semantic Conventions

It’s 2 AM. Your API is timing out. Logs show a slow query. Metrics flag a spike in DB connections. Traces reveal a 5-second delay on a database call. But then the questions start: Which database? Does the query match the delay? Why doesn’t this align with the connection pool metrics? Each tool uses a different label: db.name, database, or sometimes nothing at all. Without a shared schema, connecting the dots is slow and frustrating.
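
To make the shared-schema idea concrete, here is a small Python sketch that renames ad hoc attribute keys to one canonical key before signals are compared. The alias table, the `normalize` and `same_database` helpers, and the sample records are hypothetical. The canonical key `db.name` is taken from the text above; the key that the current OpenTelemetry semantic conventions actually prescribe should be checked against the specification.

```python
# Hypothetical alias table: ad hoc keys mapped to one canonical key.
CANONICAL_DB_KEY = "db.name"  # assumption, taken from the labels named above
ALIASES = {
    "database": CANONICAL_DB_KEY,
    "db_name": CANONICAL_DB_KEY,
}


def normalize(attrs):
    """Return a copy of `attrs` with known aliases renamed to canonical keys."""
    return {ALIASES.get(key, key): value for key, value in attrs.items()}


def same_database(*signals):
    """True only if every signal names a database and they all agree."""
    names = {normalize(signal).get(CANONICAL_DB_KEY) for signal in signals}
    return len(names) == 1 and None not in names


# Made-up records from three tools that label the database differently.
log_record = {"database": "orders", "message": "slow query"}
metric_point = {"db.name": "orders", "connections": 87}
trace_span = {"db_name": "orders", "duration_ms": 5000}

correlated = same_database(log_record, metric_point, trace_span)
```

Once every signal carries the same key, the question "which database?" becomes a simple equality check instead of a guessing exercise.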

From Weeks to Hours: How Technical Teams Are Driving Fast ROI

Speed is no longer a luxury in IT operations—it’s a requirement. When systems falter, alerts spike, or new services go live, time becomes the most valuable resource. And yet, many IT teams are still shackled to tools and processes that take weeks—or months—to show measurable value. The question technical leaders increasingly ask is: How fast can we get value? Not just dashboards. Not just data.

Enforce configuration standards with the Opslogix Compliance Management Pack

Maintaining compliance is not just a matter of policy; it is a matter of operational stability and security. But with so many moving parts, configuration drift is almost inevitable. The Opslogix Compliance Management Pack helps identify these deviations by continuously verifying key system configurations and alerting when they fall out of alignment.

Ensure the availability of critical services with the Opslogix Core Windows Service Management Pack

In a typical SCOM environment, many Management Packs are designed to monitor services tied to a specific technology, such as SQL Server, IIS, or the Windows operating system itself. But what about services that don’t belong to any particular application but are essential across all servers?