The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

How to Run Elasticsearch on Kubernetes

Elasticsearch stands as one of the most robust open-source search engines available today. Built on Apache Lucene, it handles complex search operations, real-time analytics, and large-scale data processing with impressive speed and accuracy. Kubernetes has transformed how we deploy and manage containerized applications. This orchestration platform automates deployment, scaling, and operations of application containers across clusters of hosts.

LangChain & LangGraph: The Frameworks Powering Production AI Agents

Your AI agent worked flawlessly in development, with fast responses, clean tool use, and nothing out of place. Then it hit production. A simple "What's our pricing?" query triggered six API calls, took 8 seconds, and returned the wrong answer. No errors. No stack traces. Unlike traditional systems, AI agents don't crash; they drift. They make poor decisions quietly, and your monitoring says everything's fine.

IT Monitoring News | July '25 Edition

Welcome to the July edition of the NiCE bi-monthly IT monitoring news! As we reach the height of summer, we’re thrilled to share the latest updates, insights, and resources to help you stay ahead in IT monitoring. With new developments and recent releases, there’s plenty to discover, enhance, and get excited about. Let’s jump in!

StatusGator now monitors 6,000+ services

Today, StatusGator monitors over 6,000 cloud services and tools — a massive expansion that reflects how far we’ve come, and how deeply embedded we are in the fabric of modern infrastructure. In today’s world, your product’s reliability depends on a web of vendors — authentication providers, analytics platforms, CDNs, payment processors, communication tools, and more. At 6,000+ services, StatusGator now reflects your entire digital supply chain.

Top 5 outages detected by StatusGator in June 2025

June 2025 saw several high-impact outages across popular cloud services — from infrastructure giants like Google Cloud to developer platforms like Supabase and Heroku. For IT teams, MSPs, and developers, even short service disruptions can have ripple effects across workflows and customer experience. At StatusGator, we continuously monitor thousands of services to detect issues in real time — often before they’re publicly acknowledged.

Faster incident response through distributed tracing: Inside Glovo's use of Traces Drilldown

It’s almost 1 p.m. on a Monday and you’re hungry. You pull up your meal delivery app and select your favorite restaurant and dish. Then you go to check out and nothing happens. Your frustration mounts as you get hungrier by the minute. But there’s frustration on the other side of that transaction as well: engineers are scrambling to figure out what’s wrong as orders drop and revenue losses rise.

Why GovRAMP-authorized observability matters for state, local, and education IT teams

Building on our FedRAMP Moderate authorization and our “In Process” status for FedRAMP High, Datadog for Government is now “In Process” for GovRAMP High Authorization, giving agencies a unified observability platform that meets the toughest public-sector security bars.