Datadog

Paving the Road for Proactive Reliability

Jan 5, 2024 By Datadog In Datadog

Expedia Group has thousands of engineers and microservices. Years of heterogeneous infrastructure and technology choices created inefficiencies and a need for a set of automated best practices for its engineering teams. Over the past two years, using a data-driven approach, the company has built a set of platforms that help teams adopt good reliability practices, including chaos engineering, release safety, and automatic failover between cloud regions. In this talk, Kaushik Patel and Nikos Katirtzis cover the platforms they've built and how they used data to drive their investment decisions.

View Video

Building & Scaling Distributed Teams

Jan 4, 2024 By Datadog In Datadog

To be successful, distributed teams require different strategies, processes, and rituals than co-located teams. Learn a variety of tips to build, nurture, and scale distributed teams across offices, home offices, timezones, countries, and cultures.

View Video

Detect Java code-level issues with Seagence and Datadog

Jan 4, 2024 By Emily Chang In Datadog

In Java applications, concurrency issues can be difficult to reproduce and debug. Because work is scheduled nondeterministically across threads, the conditions that have led to an error in one execution of the program may not trigger the same issue the next time around. Exceptions that are silently handled—also known as swallowed exceptions—can also be challenging to debug because they typically do not leave any trace in the logs.
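As a minimal sketch of what a swallowed exception looks like, the two methods below handle the same parsing failure: one discards it and the other records it. The class name, method names, and the in-memory `LOG` list (a stand-in for a real logger) are hypothetical and chosen only for illustration; this is not how Seagence or Datadog detect such issues.

```java
import java.util.ArrayList;
import java.util.List;

public class SwallowedExceptionDemo {
    // Hypothetical stand-in for a real logger, so the example has no dependencies.
    static final List<String> LOG = new ArrayList<>();

    // Swallowed: the catch block discards the exception, so nothing reaches the log.
    static int parseSwallowed(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            return -1; // the failure is hidden behind a default value
        }
    }

    // Same fallback value, but the failure is recorded before returning.
    static int parseLogged(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            LOG.add("could not parse '" + raw + "': " + e.getMessage());
            return -1;
        }
    }

    public static void main(String[] args) {
        int a = parseSwallowed("12x");
        System.out.println("swallowed result=" + a + ", log entries=" + LOG.size());
        int b = parseLogged("12x");
        System.out.println("logged result=" + b + ", log entries=" + LOG.size());
    }
}
```

Both methods return the same fallback, so callers cannot tell them apart; only the second leaves evidence that something went wrong.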

Read Post

Quickly remediate issues in your Azure applications with Datadog Workflow Automation

Jan 3, 2024 By Syed Sarjeel Yusuf In Datadog

Datadog Workflow Automation speeds up incident response and remediation for DevOps, SRE, and security teams by enabling them to automatically run predefined task sequences whenever specific alerts or security signals are triggered. After the feature’s initial release in 2023, Datadog is now excited to announce a significant expansion of its Workflow Automation capabilities with Azure actions, allowing engineers to create automated workflows for their Azure resources for the first time.

Read Post

Improve your shift-left observability with the Datadog Service Catalog

Dec 26, 2023 By Thomas Sobolik In Datadog

Your applications are only as powerful as they are iterable. To keep up with their rapidly changing production environments, your teams need reliable CI/CD systems that implement best practices—including build and test automation, flaky test management, and deployment management. By optimizing their CI/CD pipelines, your teams can build their apps more efficiently, deploy them more safely, and catch bugs and security vulnerabilities before they make it to production.

Read Post

Why Glovo Chose Database Monitoring to Gain Context for Troubleshooting Issues

Dec 22, 2023 By Datadog In Datadog

Hear from a Glovo engineer how Datadog Database Monitoring helped their storage team reduce costs, time spent on databases, and peak CPU usage.

View Video

How Toyota is using Datadog and AI/ML to invent new ways for humans to be more mobile

Dec 21, 2023 By Datadog In Datadog

Toyota is best known for making great cars and trucks. As a leader in technology and mobility, the company is on a mission to build a better future where everyone has the freedom to move. By partnering with Datadog, Toyota is taking advantage of the latest AI/ML to innovate and invent new ways for humans to be more mobile, while future-proofing its tech stack.

View Video

Investigate your log processing with the Datadog Log Pipeline Scanner

Dec 20, 2023 By Thomas Sobolik In Datadog

Large-scale organizations typically collect and manage millions of logs a day from various services. Within these orgs, many different teams may set up processing pipelines to modify and enrich logs for security monitoring, compliance audits, and DevOps. Datadog Log Pipelines let you ingest logs from your entire stack, parse and enrich them with contextual information, add tags for usage attribution, generate metrics, and quickly identify log anomalies.

Read Post

Scaling Up, One Network Bottleneck at a Time

Dec 20, 2023 By Datadog In Datadog

Processing data at scale involves moving packets through a network—but what happens when that network isn't cooperative? Anatole Beuzon, a Software Engineer at Datadog, discusses how he investigated and resolved network issues in Datadog’s larger data-processing apps and how you can apply these same methods to your own production workloads.

View Video

Monitor Ray applications and clusters with Datadog

Dec 18, 2023 By Bowen Chen In Datadog

Ray is an open source compute framework that simplifies the scaling of AI and Python workloads for on-premise and cloud clusters. Ray integrates with popular libraries, data stores, and tools within the machine learning (ML) ecosystem, including Scikit-learn, PyTorch, and TensorFlow. This gives developers the flexibility to scale complex AI applications without making changes to their existing workflows or AI stack.

Read Post
