Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Application Performance Monitoring and related technologies.

Complete Guide to Redis Monitoring: Essential Metrics, Tools & Best Practices 2025

Redis is a powerful tool, but its position in the critical path of applications means that performance issues can have a widespread impact. Whether you use Redis as a cache, session store, or primary database, effective monitoring is essential to prevent slowdowns and ensure a responsive user experience. This guide provides a comprehensive walkthrough of Redis monitoring, covering the essential metrics you need to track, the tools available to you, and the best practices to adopt in 2025.

Why Observability Isn't Just for SREs (and How Devs Can Get Started)

Almost every other day, when I scroll past r/devops or r/sre, I see a post like this asking how a dev can get started with devops, observability, etc. Sample Reddit thread on how to get started with OTel This blog is an attempt for anyone lost to find their way into observability and a wake-up call for devs to they should think about observability more actively today than ever before. A dev’s observability playbook.

This Month in Datadog: Bits AI SRE, Datadog Data Observability, and more

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit Cloud Monitoring as a Service | Datadog. This month, we chat with two guests about Bits AI SRE and Datadog Data Observability.

Multi Factor Authentication for Synthetic Monitoring for AVD

Today, I’ll cover some of the basics of monitoring Multi-Factor Authentication and why ensuring MFA is implemented is essential, particularly in environments where remote access is possible. I’ll cover some recent, specific case studies where a lack of MFA has led to security breaches and the mechanisms the bad actors used.

AI Agents Console: Monitor the behavior and interactions of any AI agent in your stack

With Datadog's AI Agents Console, you can monitor the behavior and interactions of any AI agent that’s a part of your enterprise stack, whether that’s a computer use agent like OpenAI’s Operator, IDE agent like Cursor, DevOps agent like Github Copilot, enterprise business agent like Agentforce, or your internally built agents. You'll have full visibility into every agent's actions, insights into the security and performance of your agents, analytics on user engagement, and measurable business value from every agent, all in a centralized location.

New in APM

Datadog’s Latency Investigator for APM—now in Preview—automatically investigates hypotheses in the background, comparing historical traces and correlating change tracking, DBM, and profiling signals. This helps teams quickly isolate root causes and understand impact without combing through raw telemetry data. You can go from detection to resolution in a single workflow, and generate a pull request to apply a recommended fix, all without leaving Datadog..

Data Observability: Build confidence in the data life cycle

Datadog Data Observability provides a complete solution with quality checks (e.g., volume, row changes, freshness), custom SQL-based monitors, anomaly detection, column-level lineage across systems like Snowflake and Tableau, full pipeline visibility, and targeted alerts when data issues arise.

Observing Vercel AI SDK with OpenTelemetry + SigNoz

LLM-powered apps are growing fast, and frameworks like the Vercel AI SDK make it easy to build them. But with AI comes complexity. Latency issues, unpredictable outputs, and opaque failures can impact user experience. That’s why monitoring is essential. By using OpenTelemetry for standard instrumentation and SigNoz for observability, you can track performance, detect errors, and gain insights into your AI app’s behavior with minimal setup.

OpenTelemetry NestJS Implementation Guide: Complete Setup for Production [2025]

NestJS applications require comprehensive monitoring to ensure optimal performance and rapid issue resolution. As your application grows—spanning multiple services, databases, and external APIs—understanding what's happening under the hood becomes critical. That's where OpenTelemetry comes in. OpenTelemetry provides vendor-agnostic observability for your NestJS applications through distributed tracing, metrics, and logs.