
Monitor Azure Managed Redis with Datadog

Azure Managed Redis is Microsoft’s fully managed, enterprise-tier in-memory data store. It is designed for the low-latency caching, session storage, and real-time data needs of modern applications, including AI workloads that depend on fast vector and embedding lookups. Because user-facing applications often query Redis directly, even small regressions in latency, hit rate, or memory pressure can degrade the user experience.

Deploy Datadog Kubernetes Autoscaling at scale

Every Kubernetes environment accumulates waste over time. Teams overprovision CPU and memory requests to avoid performance risk, run idle replicas to preserve headroom, and leave Horizontal Pod Autoscalers (HPAs) untouched long after workload behavior has changed. Some of this waste can be addressed at the node level, where Datadog Cluster Autoscaling helps teams rightsize capacity.

Unified observability for Alibaba Cloud with Datadog

Alibaba Cloud is a major cloud provider in APAC, offering industry-leading foundational AI models in addition to compute, managed databases, object storage, and Kubernetes through its Container Service for Kubernetes (ACK). Teams choose Alibaba Cloud for its infrastructure availability across Asia Pacific and its managed services. For SREs and platform engineers, that often means running Alibaba Cloud alongside AWS, Google Cloud, or Microsoft Azure.

Investigate funnel drop-offs with Product Analytics

For most product teams, funnels are a staple of the analytics toolkit, despite a frustrating limitation: you can see the step at which users drop off, but understanding why requires hours of manual slicing across segments, separate comparison views, and a lot of trial and error before you land on a useful hypothesis. Even when you find something meaningful, taking action typically means jumping to another tool, building a new segment, or filing a request with a data team.

Measure the real impact of AI coding tools on software delivery with Datadog AI Impact

Engineering teams have rapidly adopted AI coding tools, but organizations still struggle to understand their impact. Existing dashboards focus on activity, such as daily active users, acceptance rates, or lines of generated code, but these metrics don’t answer a more important question: Are teams actually shipping more, faster, and with fewer issues?

How to measure developer experience (DevEx) in the AI era

As AI coding assistants dramatically inflate PR counts, commit frequency, and lines of code, the limitations of individual output metrics have never been more apparent. A developer can now produce significantly more lines per session, but higher volume doesn’t guarantee that the code is stable, maintainable, or successfully running in production. GitClear analyzed over 200 million lines of code and found that code churn nearly doubled following widespread AI adoption.

Project and manage cloud spend with Datadog budget forecasting

Cloud and SaaS spending continues to grow across teams, services, and providers, changing too quickly for retrospective cost management workflows to keep up. Finance and engineering leaders often rely on last month’s reports or manually maintained spreadsheets, which don’t reflect current usage. As a result, teams lack context on how spend is trending and often discover budget overruns only after they’ve occurred.

How to audit and clean up monitors effectively

Alert fatigue and blind spots develop together. A monitoring stack that generates noise while missing critical issues may have incomplete coverage, poorly configured alerts, or both. When the stack grows reactively, without a structured coverage assessment, both problems worsen. Teams often add monitors when something breaks and tune thresholds when alerts become unbearable, but they rarely audit the overall setup to see whether it works.

How we made a SQL query optimization agent 59% more accurate using autoresearch and LLM Observability

Without experiment infrastructure to help you test your LLM applications, every research session starts with the same questions: What have we tried previously? What were the numbers? Which prompt version produced that result? Why did we discard that approach? The answers live in scattered notes, terminal history, and half-remembered conversations. Each handoff between sessions loses context. In practice, iteration can slow down as teams get bogged down in testing and analysis.

Explore Datadog metrics with Natural Language Queries

Metric exploration often begins with a simple question, but answering that question can require deep familiarity with metric names, tag structures, and query syntax. Experienced users spend time refining queries through trial and error, and newer users struggle to get started. As a result, teams face delays in troubleshooting and analysis. Valuable observability data, including metrics that are difficult to discover and query, also goes underused.