
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

A Bright Outlook: Building Operational Resilience for the Year Ahead

As we step into a new year, one truth stands firm in financial services: resilience isn’t optional – it’s expected. Markets fluctuate, regulations evolve, and technology accelerates. Amid this complexity, IT leaders carry the responsibility of ensuring that operations don’t just survive disruption, they thrive through it.

New Year, New Telemetry: Resolve to Stop Breaking Dashboards

It's 2026. Your New Year's resolution was to finally migrate to OpenTelemetry. But you're staring at dozens of dashboards that depend on your current data format, and that migration deadline is looming... Sound familiar? If you're an SRE or Platform Engineer facing a top-down OTel mandate, you're not alone. The challenge isn't just about adopting a new standard—it's about doing so without disrupting the observability systems your team depends on every day.

How to Ensure AI-Generated Code is Reliable with Runtime Context

TL;DR: AI coding assistants have sped up code delivery but have also created a validation gap. Historical telemetry and static analysis cannot predict the behavior of unfamiliar, high-volume code. Lightrun’s Runtime Context MCP closes that gap, allowing AI assistants to verify how code behaves before it breaks and to resolve issues in real time.
Sponsored Post

Best Downdetector Alternatives for Outage Monitoring in 2026

To keep operations running, businesses and individuals increasingly rely on online services. When outages occur, having the right tools to detect and respond quickly is essential. Outage monitoring platforms provide real-time insights into service disruptions, helping minimize downtime and maintain productivity. While Downdetector is a widely recognized platform, its focus on consumer-level features may not fully meet business needs. Organizations relying on multiple third-party services require tools with advanced capabilities like deeper insights, customizable notifications, and seamless integrations.

Fair usage limits: a safer way to scale observability

For the past several years, Coralogix customers have used the platform to ingest, process, and analyze large volumes of observability data without artificial barriers or unexpected constraints. This flexibility has enabled teams to experiment freely, evolve their architectures, and scale smoothly alongside their systems.

Automating BGP Troubleshooting with Kentik AI Advisor

In this demo, we use Kentik AI Advisor to troubleshoot a real-world BGP misconfiguration that brings down a peering session with a transit provider. You’ll see how AI Advisor works both as a dedicated page and as an in-portal overlay, using natural language to identify the affected interface, correlate SNMP and syslog data, and pinpoint a maximum-prefix issue as the root cause. Then we accelerate and standardize the workflow with custom network context and AI-powered runbooks, so every engineer can troubleshoot BGP alerts like an expert.