How to spot and fix memory leaks in Go

A memory leak is a faulty condition where a program fails to free up memory it no longer needs. If left unaddressed, memory leaks result in ever-increasing memory usage, which in turn can lead to degraded performance, system instability, and application crashes. Most modern programming languages include a built-in mechanism to protect against this problem, with garbage collection being the most common. Go has a garbage collector (GC) that does a very good job of managing memory.

How we used Datadog to save $17.5 million annually

Like most organizations, we are always trying to be as efficient as possible in our usage of our cloud resources. To help accomplish this, we encourage individual engineering teams at Datadog to look for opportunities to optimize. They can share their performance wins, big or small, in an internal Slack channel along with visualizations and, often, calculations of the resulting annual cost savings.

Optimize your AWS costs with Cloud Cost Recommendations

Managing your AWS costs is both crucial and complex, and as your AWS environment grows, it becomes harder to know where you can optimize and how to execute the necessary changes. Datadog Cloud Cost Management provides invaluable visibility into your cloud spend that enables you to explore costs and investigate trends that impact your cloud bill.

Operator vs. Helm: Finding the best fit for your Kubernetes applications

Kubernetes operators and Helm charts are both tools used for deploying and managing applications within Kubernetes clusters, but they have different strengths, and it can be difficult to determine which one to use for your application. Helm simplifies the deployment and management of Kubernetes resources using templates and version-controlled packages. It excels in scenarios where repeatable deployments and easy upgrades or rollbacks are needed.

Integration roundup: Understanding email performance with Datadog

Visibility into email health and performance is indispensable to any organization seeking to reach its customers through their inboxes. As they work to curtail spam, internet service providers (ISPs) are redefining the standards of deliverability on an ongoing basis, and organizations often struggle to adapt.

Get insights into service-level Fastly costs with Datadog Cloud Cost Management

As your organization scales its applications across many different cloud and SaaS providers, it becomes more challenging to understand your costs. You likely receive your bill at the end of the month, meaning you don’t have real-time visibility into who’s spending what and which services or applications your teams are spending the most on. Fluctuating service costs also make it difficult to break down your bill and identify what is driving spend, leaving you unable to take action.

Optimize Ruby garbage collection activity with Datadog's allocations profiler

One Ruby feature that embodies the principle of “optimizing for programmer happiness” is how the language uses garbage collection (GC) to automatically manage application memory. But as Ruby apps grow, GC itself can become a big consumer of system resources, and this can lead to high CPU usage and performance issues such as increased latency or reduced throughput.

Best practices for monitoring and remediating connection churn

Elevated connection churn can be a sign of an unhealthy distributed system. Connection churn refers to the rate of TCP client connections and disconnections in a system. Opening a connection incurs a CPU cost on both the client and server side. Keeping those connections alive also has a memory cost. Both the memory and CPU overhead can starve your client and server processes of resources for more important work.

Anthropic Partners with Datadog to Bring Trusted AI to All

At Datadog’s 2024 DASH conference, Anthropic President and Co-Founder Daniela Amodei announced the new Anthropic integration with Datadog’s LLM Observability. This native integration offers joint customers robust monitoring capabilities and a suite of evaluations that assess the quality and safety of LLM applications. Get real-time insights into performance and usage, with full visibility into the end-to-end LLM trace, enabling you to troubleshoot issues, reduce downtime, and get your Claude-powered applications to market faster.