The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Identify risky behavior in cloud environments

Risk assessment requires context. One of the primary challenges with protecting cloud environments is understanding how certain activity can lead to risk. Risky behavior can be categorized as any activity or action that increases the likelihood of an attack in your cloud environment. While certain activity may not be malicious on its own, it can expand an environment’s attack surface or indicate post-compromise behavior.

On-Premise to Cloud Migration Step-by-Step Guide for Network Management

Cloud adoption has reached a tipping point — 98% of U.S. organizations have already migrated at least some business operations to the cloud. Global cloud spending is projected to reach $1.3 trillion by 2025 as companies rapidly embrace off-premise solutions, with 63% of IT decision-makers reporting accelerated cloud migration plans over the past 12 months.

From Traditional Monitoring to AI-Enhanced Observability

Traditional monitoring approaches have served IT operations for decades, providing basic visibility into system health through predefined metrics and thresholds. However, these conventional methods face significant limitations when confronted with modern, complex environments. Static thresholds and rules: traditional monitoring relies heavily on manually defined thresholds and rules.
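To make the static-threshold limitation concrete, here is a minimal Python sketch. It is not taken from the article: the function names, the sample data, and the rolling-baseline rule (mean plus `k` standard deviations over a short window) are all illustrative assumptions, and the baseline is a simple statistical stand-in rather than an AI technique.

```python
from collections import deque
from statistics import mean, pstdev


def static_alerts(samples, threshold):
    """Return indices of samples above a fixed, manually chosen threshold."""
    return [i for i, value in enumerate(samples) if value > threshold]


def baseline_alerts(samples, window=5, k=3.0):
    """Return indices of samples far above a rolling baseline.

    Hypothetical rule: alert when a sample exceeds the mean of the previous
    `window` samples by more than `k` population standard deviations.
    """
    history = deque(maxlen=window)
    alerts = []
    for i, value in enumerate(samples):
        if len(history) == window:
            mu = mean(history)
            # Floor sigma so a perfectly flat history does not alert on noise.
            sigma = max(pstdev(history), 1e-9)
            if value > mu + k * sigma:
                alerts.append(i)
        history.append(value)
    return alerts


# Made-up CPU percentages: the spike to 60 is unusual for this host,
# but it never crosses a static threshold of 80.
cpu = [20, 21, 19, 20, 22, 21, 60, 20, 21]
static_result = static_alerts(cpu, threshold=80)
baseline_result = baseline_alerts(cpu, window=5, k=3.0)
print(static_result, baseline_result)
```

With this sample data the fixed threshold stays silent while the rolling baseline flags the spike, which is the kind of gap the teaser describes.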

How to Migrate from SolarWinds to Auvik Without Downtime

Switching from one network management system (NMS) to another is a big decision for IT teams and MSP businesses. An NMS is the central hub for everything from network troubleshooting deep dives to planning hardware refresh cycles and enabling quarterly business reviews (QBRs). And even if a different platform is clearly a better fit for your organization than your current NMS, it’s important to consider the operational overhead of actually making the switch.

OpenTelemetry vs. Prometheus Usage: 2025 Observability Survey Analysis | Grafana Labs

Myrle Krantz, Director of Engineering at Grafana Labs, talks about vendor lock-in, OpenTelemetry vs. Prometheus, open source adoption, and other tooling findings from Grafana Labs’ third annual Observability Survey — featuring insights from over 1,200 practitioners across the globe.

MCP, Easy as 1-2-3?

Seems like you can’t throw a rock without hitting an announcement about a Model Context Protocol server release from your favorite application or developer tool. While I could just write a couple hundred words about the Honeycomb MCP server, I’d rather walk you through the experience of building it, some of the challenges and successes we’ve seen while building and using it, and talk through what’s next. It should be pretty exciting, so strap in!

Why you should embrace more incidents (seriously!)

We’re all looking for ways to improve our incident response. We investigate various metrics and methodologies, all in the name of making sure our customers see the reliable and performant systems we’ve sought to build. In fact, all these efforts are leading us, as an industry, to finally realize the power of surprising, anomalous events in our systems. They give us an opportunity to reexamine our expectations and see how our models of the sociotechnical system differ from reality.

Observability Trends for 2025

Evolving digital technologies and artificial intelligence (AI) are fundamentally reshaping business dynamics. Seeing the growth and impact of online businesses, organizations across industries have started adopting this approach to create revenue streams and enhance their customer experience. On one hand, it turned out to be a brilliant strategy; on the other, managing the complex business data and systems became a big challenge.

Database Monitoring Metrics: What to Track & Why It Matters

Let’s be honest—your database isn’t just another component. It’s the thing holding everything else together. When it slows down or fails, the ripple effects hit fast and hard. So keeping an eye on its performance? Non-negotiable. The challenge is, there’s no shortage of metrics you could monitor. But not all of them are useful.

Histogram Buckets in Prometheus Made Simple

Staring at a monitoring dashboard and still feeling like you're missing half the picture? Happens more often than you'd think. Especially when you're dealing with metrics like request durations or payload sizes—data that doesn’t behave nicely or fit into neat little averages. This is where Prometheus' histogram buckets step in. They're not just another metric type; they're a better way to track the messy, uneven world of performance data.
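As a rough illustration of the idea, the pure-Python sketch below mimics how a Prometheus-style histogram counts observations into cumulative "less than or equal" buckets, and how a quantile can be estimated from those counts by linear interpolation (the approach used by PromQL's `histogram_quantile`). It does not use the Prometheus client library; the class name, helper function, bucket bounds, and sample durations are assumptions made up for this example.

```python
import bisect
import math


class SketchHistogram:
    """Toy histogram with Prometheus-style upper bounds plus a +Inf bucket."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds) + [math.inf]
        self.counts = [0] * len(self.bounds)  # per-bucket, not yet cumulative

    def observe(self, value):
        # First bucket whose upper bound is >= value ("le" semantics).
        self.counts[bisect.bisect_left(self.bounds, value)] += 1

    def cumulative(self):
        """Return (upper_bound, cumulative_count) pairs, as Prometheus exposes."""
        running, out = 0, []
        for bound, count in zip(self.bounds, self.counts):
            running += count
            out.append((bound, running))
        return out


def estimate_quantile(q, cumulative):
    """Estimate quantile q (0..1) by linear interpolation inside a bucket."""
    total = cumulative[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in cumulative:
        if count > prev_count and count >= rank:
            if math.isinf(bound):
                # Cannot interpolate into +Inf; fall back to the last finite bound.
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


hist = SketchHistogram([0.1, 0.5, 1.0])  # made-up bounds, in seconds
for duration in [0.05, 0.1, 0.3, 0.4, 0.7, 2.0]:
    hist.observe(duration)

buckets = hist.cumulative()
p50 = estimate_quantile(0.5, buckets)
p99 = estimate_quantile(0.99, buckets)
print(buckets, p50, p99)
```

Note that the estimate is only as precise as the bucket layout: the single 2.0 second outlier lands in the +Inf bucket, so the p99 estimate is capped at the highest finite bound of 1.0.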