The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Introducing: Checkly Agent Skills

AI coding agents are excellent at writing code. Ask Claude Code, Codex, or Cursor to add a feature, and it just works. At Checkly, we were ready for the new agentic world from the start! Monitoring as Code means your entire monitoring setup lives in your repository. API Checks, Browser Checks, alert channels, and status pages are all defined in code, managed with the Checkly CLI, and version-controlled like any other part of your stack.

Improve performance and reliability with APM Recommendations

SREs and application developers rely on telemetry data to understand and improve their systems. As organizations scale and evolve, those systems generate an ever-growing volume of metrics, logs, and traces. But more data alone does not make it easier to improve performance or reliability: identifying meaningful optimizations still requires careful investigation and analysis.

AI Query Assist for SolarWinds SQL Sentry

Rewrite inefficient SQL Server queries in seconds, not hours. In this demo, we show you how AI Query Assist in SolarWinds SQL Sentry transforms the way you tune performance. Watch how to take a problematic query from the "Top SQL" view and use generative AI to produce optimized rewrites and uncover missing indexes. What you will see: Instant Optimization, or how to automate query rewriting and get plain-language explanations of the logic changes.

Designing Alerts for Action

In the first two posts of this series, we explored how alert noise emerges from design decisions, and why notification lists fail to create accountability when responsibility is unclear. There’s a deeper issue underneath both of those problems. Many alerting systems are designed without being clear about the outcome they’re meant to produce. When teams don’t explicitly decide what they want to happen as a result of a signal, they default to the loudest option available.

Unlimited Team Sizes for All

Starting from today, Healthchecks.io users on all plans (Hobbyist, Supporter, Business, Business Plus) can invite an unlimited number of users into their projects. Previously, the limits were: 3 team members for Hobbyist and Supporter, 10 team members for Business, and unlimited team members for Business Plus. From now on, it is unlimited for all.

Top 6 Cloud Monitoring Challenges in Hybrid & Multi-Cloud Environments

Hybrid and multi-cloud monitoring breaks down when teams can’t connect signals to customer impact fast enough to act. Hybrid and multi-cloud sound simple: run some workloads in public cloud, keep some on-premises, and connect it all. But in practice, you’re managing dependencies across teams and systems, tools that don’t share context, and incidents that refuse to stay in one place.

Turn Raw Data into Reliability by Changing Performance Perspectives

In a global microservices architecture, technical performance initially presents as a chaotic stream of disconnected telemetry. For a Technical Program Manager (TPM), success depends on the ability to move past these individual data points to identify stable patterns. When services enter critical states, inspecting individual logs or traces one at a time is inefficient. Protecting system reliability requires an engine that can automate pattern recognition at scale.