Datadog

How to support a growing Kubernetes cluster with a small etcd

Etcd plays a critical role in your Kubernetes setup: it stores the ever-changing state of your cluster and its objects, and the API server uses this data to manage cluster resources. As your applications thrive and your Kubernetes clusters see more traffic, etcd handles an increasing amount of data. But etcd’s storage space is limited: the recommended maximum is 8 GiB, and a large and dynamic cluster can easily generate enough data to reach that limit.
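
To make the limit concrete, here is a minimal Python sketch of the arithmetic, assuming you already have the etcd database size in bytes (for example, from etcd's own metrics). The function names, the 80% warning threshold, and the sample sizes are illustrative assumptions, not values from the article.

```python
GIB = 1024 ** 3

# Recommended maximum etcd storage size cited above.
RECOMMENDED_MAX_BYTES = 8 * GIB


def quota_usage(db_size_bytes: int, quota_bytes: int = RECOMMENDED_MAX_BYTES) -> float:
    """Return the fraction of the storage quota currently in use."""
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    return db_size_bytes / quota_bytes


def should_warn(db_size_bytes: int,
                quota_bytes: int = RECOMMENDED_MAX_BYTES,
                threshold: float = 0.8) -> bool:
    """Return True once usage reaches the (assumed) warning threshold."""
    return quota_usage(db_size_bytes, quota_bytes) >= threshold


if __name__ == "__main__":
    size = 7 * GIB  # hypothetical database size
    print(f"usage: {quota_usage(size):.1%}, warn: {should_warn(size)}")
```

A check like this is only useful if it fires well before the quota is reached, which is why the sketch warns at a fraction of the limit and not at the limit itself.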

Monitor your Pinecone vector databases with Datadog

Pinecone is a vector database that helps users build and deploy generative AI applications at scale. Whether using its serverless architecture or a hosted model, Pinecone allows users to store, search, and retrieve the most meaningful information from their company data with each query, sending only the necessary context to large language models (LLMs). By providing the ability to search and retrieve contextual data, Pinecone enables you to reduce LLM hallucinations and enhance data security.

Best practices for monitoring event-driven architectures

Microservices architectures empower individual teams to choose their own programming language, tools, and technologies, resulting in more independence and the ability to develop and release features faster. While there are various types of integration patterns that can facilitate microservice communication, many organizations choose to adopt event-driven architectures (EDAs) because of their scalability, agility, and resilience.

re:Invent Recap Livestream: 2024

Did you miss this year’s re:Invent? Or maybe you were onsite but too busy deep diving on certifications, new products, and networking. Don’t worry—the Datadog team is streaming right to your home on December 17 to recap all of the highlights from the event. Join Andrew Krug from Datadog’s Technical Community along with a host of AWS guests to hear about exciting announcements from AWS re:Invent 2024, Datadog’s latest product launches, and a rundown of the best on-demand sessions that you’ll want to make sure to tune into.

This Month in Datadog: Monitor OpenAI costs, Kubernetes Active Remediation, IaC Security, and more

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit the Datadog website. This month, we put the Spotlight on Datadog Cloud Cost Management for OpenAI.

This Month in Datadog - December 2024

On the December episode of This Month in Datadog, Jeremy Garcia (VP of Technical Community and Open Source) covers Kubernetes Active Remediation, Datadog IaC Security, and a trio of new features for monitoring AWS resources. Later in the episode, Natasha Goel (Product Manager) spotlights Datadog Cloud Cost Management for OpenAI. Also featured is a short recap of Datadog at KubeCon North America and AWS re:Invent 2024.

Datadog Database Monitoring: Improve Database and Application Performance

Datadog Database Monitoring unifies query, application, and database telemetry in one platform, enabling teams to easily identify bottlenecks, understand database load, optimize query performance, uncover costly queries, and correlate database and application telemetry.

Increase visibility into network incidents using moovingon.ai and Datadog

moovingon.ai is a platform that consolidates alerts, incidents, audits, runbooks, and other resources for 24/7 network operations center (NOC) engineering teams. These teams often have to work collaboratively to maintain uptime for mission-critical cloud infrastructure and applications and need specialized resources to facilitate investigations in the event of an issue.

Highlights from AWS re:Invent 2024

Whether or not you made the journey to this year’s AWS re:Invent, there’s always a variety of great announcements lost amid an action-packed week of keynotes, breakouts, expo hall demos, and networking sessions. No need to worry: we’re always happy to be a big part of the re:Invent experience and share our observations with you. You can also register to join us on December 17, 2024, for a re:Invent re:Cap livestream.

Automatically group events and reduce noise with AI-powered Intelligent Correlation

When you have a complex IT environment with many disparate tools, data sources, and teams, alert noise becomes overwhelming. This can delay incident response and cause missed alerts, ultimately leading to critical incidents and outages. Datadog Event Management’s Event Correlation groups and deduplicates events and alerts, reducing noise and helping response teams act on alerts faster.
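
As a rough illustration of what grouping and deduplicating events means, the Python sketch below drops exact duplicate events and groups the rest by service. This is a generic technique, not Datadog's Intelligent Correlation algorithm; the `Event` fields, the `correlate` function, and the choice of `service` as the grouping key are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event shape; real events carry many more fields."""
    source: str
    service: str
    message: str


def correlate(events: list[Event]) -> dict[str, list[Event]]:
    """Drop exact duplicates, then group the remaining events by service."""
    seen: set[Event] = set()
    groups: dict[str, list[Event]] = defaultdict(list)
    for event in events:
        if event in seen:
            continue  # identical event already recorded: deduplicate
        seen.add(event)
        groups[event.service].append(event)
    return dict(groups)


if __name__ == "__main__":
    raw = [
        Event("pager", "checkout", "latency high"),
        Event("pager", "checkout", "latency high"),  # exact duplicate
        Event("logs", "checkout", "timeout calling payments"),
        Event("pager", "search", "error rate high"),
    ]
    for service, grouped in correlate(raw).items():
        print(service, len(grouped))
```

Four raw events collapse into two groups here, so a responder sees one item per affected service instead of one per alert.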