Best practices for using DORA metrics to improve software delivery

Software development and delivery require cross-team collaboration and cross-service orchestration, all while ensuring that organizational standards for quality, security, and compliance are consistently met. Without careful monitoring, you lose visibility into delivery workflows, making it difficult to evaluate how they impact release velocity and stability, developer experience, and application performance.

Monitor your CI/CD modernizations with Datadog CI Pipeline Visibility

As your organization adopts modern technologies and scales its workloads, it’s critical that your CI/CD environment follows suit to maintain smooth development and testing workflows. Adopting modern CI/CD tools (e.g., pipeline runners and testing frameworks) and best practices can increase the agility and resilience of your CI/CD environment as well as enable your teams to configure new jobs, stages, and tests to meet changing business requirements.

Highlights from Google Cloud Next 2024

Over 30,000 people flocked to Las Vegas to see the latest and greatest from Google Cloud and its partners at Google Cloud Next 2024. As a long-time Google Cloud partner and recipient of two Google Cloud Technology Partner of the Year awards this year, we were there in full force to showcase our unified observability and security solutions and engage with the Google Cloud community.

Datadog Conversations: How Life360 Keeps Families Safe with Observability

Life360 is a family safety app driven by the mission to protect and connect people, pets, and things. Naveen Puvvula, Director of Cloud Operations, and Jesse Gonzalez, Senior Staff Site Reliability Engineer, discuss why observability is critical to achieving reliability and how they continue to deliver real-time location updates for their users even during high-traffic events. Finally, they advise other tech leaders to choose partners who work closely with them to solve problems and technologies that reduce friction and improve developer joy.

Accelerate incident investigations with Bits AI, Datadog's generative AI co-pilot

Learn how Datadog’s generative AI assistant, Bits AI, helps organizations accelerate incident investigations: it generates summaries to get you up to speed quickly, fetches information about related past events, and updates teams and statuses, all through Slack.

Save up to 14 percent CPU with continuous profile-guided optimization for Go

We are excited to release our tooling for continuous profile-guided optimization (PGO) for Go. You can now reduce the CPU usage of your Go services by up to 14 percent by adding one line before the go build step in your CI pipeline. You will also need to supply a DD_API_KEY and a DD_APP_KEY in your environment. Please check our documentation for more details on setting this up securely.

Manage incidents seamlessly with the Datadog Slack integration

Modern, distributed application architectures pose particular challenges when it comes to coordinating incident management. DevOps, SREs, and security teams—often spread out across separate locations and time zones, and equipped with limited knowledge of each other’s services—must work quickly to collaboratively triage, troubleshoot, and mitigate customer impact.

Aggregate, correlate, and act on alerts faster with AIOps-powered Event Management

Maintaining service availability is a challenge in today’s complex cloud environments. When a critical incident arises, the underlying cause can be buried in a sea of alerts from interconnected services and applications. Central operations teams often face an overload of disparate alerts, causing confusion, delayed incident response, alert fatigue, and redundant resolution efforts. These issues can negatively impact revenue and customer experience, especially during an outage.

Track changes in your containerized infrastructure with Container Image Trends

Datadog’s Container Images view provides key insights into every container image used in your environment, helping you quickly detect and remediate security and performance problems that can affect multiple containers in your distributed system. In addition to having a snapshot of the performance of your container fleet, it’s also critical to understand large-scale trends in security posture and resource utilization over time.