
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

In the age of AI, measurement becomes our superpower

The last few years have felt less like a product roadmap and more like a scene from science fiction. Artificial intelligence didn’t simply arrive; it erupted. In what feels like a blink, we’re building software by prompting instead of programming. Our words now generate code, compose music, translate languages, and create entire digital experiences.

New features: Introducing Metrics Usage and Query Usage analyzers

As teams grow and telemetry scales, it becomes harder to keep track of which metrics matter. Labels pile up, cardinality increases, and costs start rising faster than anyone expected. At the same time, dashboards often stay quiet and alerts go untouched. The truth is, most teams don’t actually know how, or how much of, their metric data is being used, let alone which metrics are driving cost. This is exactly the problem we set out to solve.

Pastries with SREs: Holding onto extra observability data and desserts

In this episode of Pastries with SREs, we dig into why you should keep all of your observability data, even if you don’t need it quite yet. With enriched logs and flexible, cost-effective storage, you can stop worrying about what you might need later and start answering questions with confidence, no matter when they arise.

Stop Leaking PII in Your Telemetry with Cribl Guard

Sensitive data sneaks into destinations more often than teams realize. In this clip, we capture live events, spot emails and login tokens slipping through, and fix it instantly with Cribl Guard. A few clicks, a commit and deploy, and Guard redacts the data in real time. No complex configs. No regex nightmares. Just fast protection that keeps your telemetry clean and your security tight.

How to Stream AWS CloudWatch Metrics into Grafana Cloud (10× Cheaper + Near Real-Time)

Unlock faster, cheaper, and more reliable AWS observability with CloudWatch Metric Streams in Grafana Cloud. In this video, Tristan from Grafana Labs gives a full walkthrough of our new AWS Metric Streaming integration, showing how to stream CloudWatch metrics directly into Grafana Cloud using Amazon Data Firehose and Terraform.

How to Monitor Unmanaged Networks & Remote Workers

Your remote developer can't access the VPN. Is it his home router? His ISP? Your network? You have no idea and no way to find out. This is the reality of modern IT. Your network doesn't end at your office perimeter. It extends into hundreds of homes, coffee shops, branch offices, and third-party locations you'll never set foot in. And when performance tanks, you're troubleshooting blind.

How Log Management and NDR Work Together to Speed Up Incident Response

Log management and Network Detection and Response (NDR) solutions are closely related but offer different layers of visibility. Rather than overlapping, they complement each other, together providing a connected view of what’s happening in your environment. How exactly? Let’s take a closer look.

Introducing Dataspaces & Datasets

Observability data has a habit of outgrowing everything else. As telemetry volume, variety, and velocity increase, staying organized gets harder. Governance becomes messy, and the cost of digging through “everything” keeps rising. Over the past year, Coralogix’s DataPrime engine has been addressing these challenges by laying a new foundation for observability at scale.

Detecting Anomalous Spans at Scale with DataPrime

Tracing is one of the most transformative gifts of observability. It allows engineers to follow a single request through a distributed system and see every span and dependency along the way. However, even with that visibility, some of our most basic questions stay unanswered. Why did a specific span behave differently today than it did yesterday? Why did latency rise even when nothing “broke”?

Advantages of Routing Security Data Where it Has the Most Value

Enterprise data volumes are doubling every two years, but security and observability budgets remain mostly flat (or, in the worst case, are declining). As teams struggle to keep up, the challenge isn’t just the amount of data; it’s the inefficiency of how that data is collected, processed, and routed. Most organizations rely on a patchwork of agents, forwarders, and legacy collectors like Syslog to ingest telemetry from across the environment.