The latest News and Information on Application Performance Monitoring and related technologies.

How to surface misconfigured resources by defining policies | Datadog Tips & Tricks

Misconfigured infrastructure resources can be easy to miss, especially in multi-account or multi-cloud environments. From EKS clusters running on deprecated versions to RDS engines on extended support, these issues can disrupt services or drive up costs if left unchecked. In this video, we show you how to surface these misconfigurations by defining policies. By centralizing policies, you’ll gain a clear view of where to focus your remediation efforts.

Full-Circle Observability: Using SigNoz to monitor a LangChain agent that queries SigNoz MCP

In Part 1 of this series, we explored how to instrument a LangChain trip planner agent with OpenTelemetry and send telemetry data to SigNoz. By tracing each step of the planning process (LLM reasoning; tool calls for flights, hotels, weather, and activities; and the final itinerary response), we saw how observability turns a black-box agent workflow into a transparent, debuggable system.

LangChain Observability: How to Monitor LLM Apps with OpenTelemetry (With Demo App)

LangChain has become one of the most popular frameworks for building LLM-powered applications, making it easier to create agents that can reason, plan, and take actions. But like any production-grade AI app, LangChain agents can run into performance bottlenecks, hallucinations, or tool call failures. And without proper LangChain observability, it’s hard to know where things break down.

Put Cloud Costs in Front of Engineers with Datadog Cloud Cost Management

Tired of surprises on your cloud bills? With Datadog Cloud Cost Management integrated into the Software Catalog, engineers see cost, performance, and reliability side by side—no context switching required. Give every service owner the visibility they need to make cost-aware decisions.

Track Cloud Unit Economics with Datadog Cloud Cost Management

Do you know the true cost per user, API call, or checkout? Datadog Cloud Cost Management lets you break down spend by combining cost, observability, and custom business metrics—all in one place. Track cost per transaction, alert on changes, and align engineering and finance with real-time unit economics.
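As a rough illustration of the unit-economics idea above, the sketch below divides spend by a business metric and flags a day-over-day change. It is plain Python and does not use Datadog's API; the function names, the sample figures, and the 10% alert threshold are all assumptions made for this example.

```python
def cost_per_unit(total_cost: float, units: int) -> float:
    """Return spend divided by a business metric (e.g. checkouts)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost / units


def relative_change(previous: float, current: float) -> float:
    """Return the fractional change from previous to current."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous


# Hypothetical figures: daily cloud spend in dollars and checkout counts.
yesterday = cost_per_unit(total_cost=1000.0, units=400_000)
today = cost_per_unit(total_cost=1200.0, units=400_000)

# Assumed alert rule: flag a rise of more than 10% in cost per checkout.
ALERT_THRESHOLD = 0.10
change = relative_change(yesterday, today)
should_alert = change > ALERT_THRESHOLD
```

With these sample numbers, cost per checkout rises from a quarter of a cent to three tenths of a cent, so the assumed threshold would trigger.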

APM Logs: How to Get Started for Faster Debugging

When application performance monitoring detects a spike in latency or error rates, the immediate challenge is determining the underlying cause. APM logs address this by correlating performance metrics with the specific log events that occurred at the same time. Instead of switching between monitoring dashboards and manually searching through log files, APM log correlation consolidates both views.
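The correlation described above can be sketched, in its simplest form, as a time-window filter: given the moment a spike was detected, keep only the log events recorded close to it. This is a minimal illustration rather than any particular APM product's method (real tools typically also join on trace or span IDs); the function name `logs_near_spike`, the 30-second window, and the sample log records are assumptions made for this example.

```python
from datetime import datetime, timedelta


def logs_near_spike(logs, spike_time, window_seconds=30):
    """Return log events whose timestamp is within the window around spike_time."""
    window = timedelta(seconds=window_seconds)
    return [event for event in logs if abs(event["timestamp"] - spike_time) <= window]


# Hypothetical spike time and log records, for illustration only.
spike = datetime(2024, 1, 1, 12, 0, 0)
logs = [
    {"timestamp": datetime(2024, 1, 1, 11, 58, 0), "level": "INFO", "message": "cache warmed"},
    {"timestamp": datetime(2024, 1, 1, 11, 59, 50), "level": "ERROR", "message": "db connection timeout"},
    {"timestamp": datetime(2024, 1, 1, 12, 0, 20), "level": "WARN", "message": "retrying query"},
]

# Keep only the events logged within 30 seconds of the spike.
related = logs_near_spike(logs, spike)
```

Here the INFO event two minutes before the spike is dropped, while the ERROR and WARN events on either side of it are kept for inspection.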

Proactive Observability - Predictive Analytics Models and Algorithms for IT Systems and Metrics

Predictive analytics models and algorithms are an important component of eG Enterprise’s AIOps engine for proactive observability. eG Enterprise collects and analyses metrics, events, logs, and traces, and uses this data, including real usage data, to forecast future system behavior and IT resource metric levels.

How our engineers use AI for coding (and where they refuse to)

Okay, picture this: if you drew a Venn diagram of folks in tech right now, it'd probably look something like this: You'll probably find yourself in one of those circles, right? I’m guilty of falling in the intersection! Because let's be real, the 'will AI replace developers by 20xx?' debate is everywhere – Reddit, Hacker News, team Slack and even your local cafe. Well, we decided to go straight to the source.

A Practical Guide for Developers: Preventing PHP Mistakes with Performance Monitoring

Performance is one of the most critical aspects of any PHP application. A few seconds of delay or an unnoticed bottleneck can cause users to leave your site, increase bounce rates, and reduce business conversions. For developers, ensuring top performance is not always easy. Small coding mistakes and inefficient queries can accumulate into major problems over time. Without visibility into what’s happening inside the application, it becomes difficult to identify the root cause of slowdowns or failures.