Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

Best practices for monitoring and remediating connection churn

Elevated connection churn can be a sign of an unhealthy distributed system. Connection churn refers to the rate of TCP client connections and disconnections in a system. Opening a connection incurs a CPU cost on both the client and server side. Keeping those connections alive also has a memory cost. Both the memory and CPU overhead can starve your client and server processes of resources for more important work.

Key learnings from the State of Cloud Costs study

We recently released our initial State of Cloud Costs report, which identified factors shaping the costs of hundreds of organizations that use Datadog Cloud Cost Management to monitor their AWS spend. The report reveals several widely applicable themes, including the ways in which resource utilization, adoption of emerging technologies, and participation in commitment-based discount programs all shape cloud environments and costs.

Monitor Oracle Cloud Infrastructure with Datadog

Oracle Cloud Infrastructure (OCI) provides cloud infrastructure and platform services designed to support a broad spectrum of cloud strategies and workloads. OCI provides enterprise customers with scale-up resource scaling architectures, ultra-low-latency networks, and more to help them migrate legacy workloads to the cloud, while supporting cloud-native applications via an expansive network of cloud partners and services.

Monitor your Twilio resources with Datadog

Twilio is a customer engagement platform that helps organizations build communication features to meaningfully interact with customers on the channels they prefer. Twilio consists of a set of APIs for integrating communication tools such as voice, SMS, chat, video, and email into applications. Datadog’s Twilio integration collects a wide variety of logs to allow you to analyze performance issues and detect security threats across all of your Twilio resources.

Burn rate is a better error rate

While building our Service Level Objectives (SLO) product, our team at Datadog often needs to consider how error budget and burn rate work in practice. Although error budgets and burn rates are discussed in foundational sources such as Google’s Site Reliability Workbook, for many these terms remain ambiguous. Is an error budget a static quantity or a varying percentage? Does burn rate indicate how fast I’m spending a fixed quantity, or is it just another way to express error rate?

Optimize the performance of your Oracle databases with ITUnified's offering in the Datadog Marketplace

Many organizations use Oracle databases for their ability to be deployed anywhere, embedded security features, robust data analysis capabilities, and scalability. But manually managing Oracle databases can be impractical, requiring constant attention to optimize performance.

Enhance your GenAI application monitoring with Crest Data's Datadog Marketplace integrations

As organizations begin developing generative artificial intelligence (GenAI) applications, observability challenges could hinder their progress. Few robust monitoring tools for GenAI applications are available, which makes identifying and resolving issues in these applications time-consuming and error-prone.

Datadog named a Leader in 2024 Gartner Magic Quadrant for Observability Platforms

We are thrilled to announce that, for the fourth consecutive year, Datadog has been named a Leader in the 2024 Gartner Magic Quadrant for Observability Platforms. We believe that this placement reflects Datadog’s continued commitment to solving our customers’ most sophisticated challenges and building products that provide unmatched visibility into the performance, security, and cost of their traditional, cloud-based, or hybrid tech stack—from code to production.

Monitor Microsoft Fabric with Datadog

Microsoft Fabric is Microsoft’s new platform for all things data analytics—integrating key Azure data analysis products like Azure Data Factory, Azure Synapse, and Power BI into a unified platform. Fabric is intended to provide a one-stop shop where users with various levels of expertise across an organization can perform data analysis and collect insights.

Monitor your Anthropic applications with Datadog LLM Observability

Anthropic is an AI research and development company focused on building reliable and safe artificial intelligence systems. Their flagship product is Claude, an advanced language model and conversational AI assistant known for its strong capabilities in natural language processing, reasoning, and task completion. Anthropic places a particular emphasis on AI safety and ethics, and its models and APIs are used by organizations across various industries to build powerful, safe, and performant AI applications.