
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Proactive Network Protection with Progress WhatsUp Gold 2025: SSL Certificate Monitoring That Helps Prevent Outages

A single expired SSL certificate can disrupt critical services, erode customer trust, and trigger a series of avoidable issues. That’s why we’re excited to introduce a powerful new feature in Progress WhatsUp Gold 2025.0: Certificate Discovery and Monitoring. This enhancement is more than just a checkbox on a release note; it’s a proactive safeguard designed to help you spot certificate issues before they escalate into business problems.
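As a rough illustration of what certificate expiry monitoring involves, here is a minimal Python sketch using only the standard library. It is not how WhatsUp Gold implements the feature; the function names (`days_until_expiry`, `fetch_not_after`, `classify`) and the 30-day warning threshold are assumptions made for this example.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Return whole days until the certificate's notAfter time (negative once expired).

    `not_after` uses the format returned by ssl's getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    now = now or datetime.now(timezone.utc)
    expiry = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expiry - now).days


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect to host:port over TLS and return the certificate's notAfter string.

    Note: the default context verifies the certificate, so an already expired
    certificate raises ssl.SSLCertVerificationError instead of returning a date.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def classify(days_left, warn_days=30):
    """Map remaining days to a simple status label (threshold is an assumption)."""
    if days_left < 0:
        return "expired"
    if days_left <= warn_days:
        return "expiring soon"
    return "ok"
```

A scheduled job could call `classify(days_until_expiry(fetch_not_after(host)))` for each monitored host and raise an alert on anything other than "ok".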

Going beyond AI chat response: How we're building an agentic system to drive Grafana

As we look at the role AI can play in Grafana going forward, we want to move beyond the simple chat responses that dominate the world of LLMs today and into agentic systems—AI that can understand, reason, and act on your behalf. The ultimate goal is to make it easy to get things done in Grafana using natural language—whether you’re a seasoned SRE or a new developer. And in the AI world, we call this moving from chat completion to task completion.

Get a better structure in your SCOM environment with the Opslogix Classification Management Pack

Alerts in SCOM can easily become overwhelming, making your environment feel noisy and unstructured. The real challenge is getting the right number of alerts to the right people at the right time. The Opslogix Classification Management Pack includes features like tiered classification levels, dynamic grouping, and extended tagging.

What's New in InfluxDB 3.2: Explorer UI Now GA Plus Key Enhancements

InfluxDB 3.2 is now available for both Core and Enterprise, bringing the general availability of InfluxDB 3 Explorer, a new UI that simplifies how you query, explore, and visualize data. On top of that, 3.2 includes a wide range of performance improvements, feature updates, and bug fixes. InfluxDB 3 Core is free and open source, optimized for recent data, and licensed under MIT and Apache 2.

How we've created a successful FinOps practice at Datadog

When you adopt FinOps to maximize the value of your cloud spending, there are some simple first steps you can take to gain cost efficiency. For example, you can find and delete any unused resources to quickly realize a one-time optimization. But the ongoing work to manage cloud costs becomes complex as your organization grows, your infrastructure spans multiple clouds, and you can't easily see the full value of your cloud spending by tracking only the bottom line.

Install Pandora ITSM from Pandora FMS Console

Until now, deploying Pandora ITSM required a standalone installation, manual database configuration, and later integration with Pandora FMS. With the new NG 783 version, that entire process has been simplified: Pandora ITSM can now be installed directly from the Pandora FMS web console, with no additional servers, no external steps, and integration already configured.

Built for Engineers: Datadog's Vision for the Future

Datadog was built by engineers, for engineers. Opening the keynote, Datadog Co-founder & CEO Olivier Pomel delivered a clear message: observability, security, and AI are converging. From infrastructure to AI agents, the future of engineering requires one unified platform. Catch all the product announcements on our YouTube channel to see what’s next in observability and security!

Prometheus Gauges vs Counters: What to Use and When

Choosing the wrong metric type in Prometheus can lead to inaccurate dashboards, false positives in alerting, and missed indicators of system failure. Gauge metrics are intended for tracking values that can go up and down, such as memory usage, queue depth, or the number of active connections. Unlike counters, which only increment (or reset on restart), gauges reflect the current state of a resource at scrape time.
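The difference can be sketched in a few lines of plain Python. This is a simplified model of the semantics, not Prometheus code: `counter_increase` and `gauge_current` are hypothetical helpers, and the reset handling only approximates what Prometheus does (it omits the extrapolation that `rate()` and `increase()` apply).

```python
def counter_increase(samples):
    """Total increase across scraped counter samples, tolerating resets.

    A counter only goes up, so a drop between two scrapes is treated as a
    restart from zero and the new value is counted as fresh increase.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            total += cur  # reset detected: counter restarted from zero
    return total


def gauge_current(samples):
    """A gauge reflects current state, so the latest sample is the value of interest."""
    return samples[-1]


# The same scraped series, interpreted under each metric type.
scrapes = [10, 15, 20, 3, 8]
requests_served = counter_increase(scrapes)  # 5 + 5 + 3 + 5
queue_depth_now = gauge_current(scrapes)
```

Read as a counter, the drop from 20 to 3 is a process restart and the series still shows 18 units of work. Read as a gauge, the same drop is simply the resource shrinking, and only the latest value of 8 matters. Treating one as the other is what produces the misleading dashboards described above.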

Optimizing mobile website performance using digital experience monitoring

Delivering an exceptional mobile user experience (UX) is critical for business success. With mobile devices accounting for over 60% of global web traffic (2024) from billions of active users, a subpar mobile experience can lead to lost customers and revenue. Slow-loading pages, poor interactivity, and unstable layouts caused by design choices frustrate users. Bad UX quickly drives disgruntled users to competitors, and they rarely come back.

How to Reduce Application Downtime with APM?

According to a 2025 study, the average cost of downtime has climbed to as much as $9,000 per minute for large organizations. For higher-risk sectors such as finance and healthcare, downtime can exceed $5 million an hour in certain scenarios. Whether you're part of a DevOps team, an SRE, a developer, or an engineering manager, minimizing application downtime should be a critical focus. One of the most effective ways to achieve this is through Application Performance Monitoring (APM).