Datadog on Cloud Workload Identities

Datadog operates dozens of Kubernetes clusters, tens of thousands of hosts, and millions of containers across a multi-cloud environment, spanning AWS, Azure, and Google Cloud. With over 2,000 engineers, we needed to ensure that every developer and application could securely and efficiently access resources across these various cloud providers.

Cloud Control Ep #31 Balancing Innovation and Trust in Cloud Storage with Ali Zafar

In the latest episode of Cloud Control, Ali Zafar, the Vice President of Engineering for Hybrid Infrastructure at Dropbox, joins host Shon Harris to discuss the challenges of managing a multi-exabyte infrastructure. Ali shares his background working at companies like Tesla, Google, and Cisco, and how those experiences have shaped his approach to cloud and hybrid infrastructure. He emphasizes the importance of leveraging emerging technologies like AI and the significance of culture and relationships in driving innovation and efficiency.

Introducing the Jenkins to Bitbucket Pipelines migration tool

CI/CD workflows are essential for modern software development, enabling scalability, seamless integration, and ease of workflow management. However, migrating these workflows between tools—such as from Jenkins to Bitbucket Pipelines—can feel daunting due to differences in syntax, custom configurations, and repository-specific setups. To address these challenges, we’re thrilled to announce the Jenkins to Bitbucket Pipelines Migration Tool.

OpenTelemetry vs OpenTracing - Key Differences and Migration Path

OpenTelemetry and OpenTracing are two closely connected open-source projects that enhance observability in modern distributed systems. They are designed to instrument application code for generating telemetry data. OpenTelemetry is a comprehensive, vendor-neutral framework that helps capture various types of telemetry data, while OpenTracing focuses specifically on tracing and provides a way to instrument applications for that purpose.

Elasticsearch achieves Certified Software Solution status for Microsoft Azure

As a trusted partner in the Microsoft ecosystem, Elasticsearch has achieved another significant milestone by becoming a Certified Software Solution for Microsoft Azure. This certification not only underscores our commitment to excellence but also reflects our dedication to delivering seamless data solutions for our customers.

Smarter search, Uptime Monitoring, and Session Replay updates to simplify your debugging

Whether it’s sitting through a meeting that should’ve been an email or reading a blog post written by AI – no one enjoys losing time they’ll never get back. That’s why we rolled out updates to help you fix problems faster while skipping the manual grind, including smarter search, customizable issue views, real-time uptime alerts, and Session Replay for Mobile.

How to Perform Health Checks on Your Kafka Cluster: Ensuring Optimal Performance and Reliability

When managing Kafka clusters, health checks are essential, not a luxury. They’re your frontline defense in maintaining stability and performance, helping you catch issues before they snowball. Let’s dive into effective ways to assess your Kafka cluster’s health, from tracking key metrics to taking proactive steps that keep your operations running smoothly.

GitHub Status in 2024: Unveiling Patterns, Trends, and How to Stay Ahead

Note: The data presented in this analysis is based on information we collected from January 2024 to October 2024 and may contain errors or omissions. This post has been updated to include the latest dataset. GitHub and its components are used by developers and businesses around the world to power everything from small projects to large-scale operations. This is why it's crucial to understand the platform's reliability as a core business enabler.

Organizing ownership: How we assign errors in our monolith

At incident.io, we run on a monolith. This brings a whole load of benefits that we don’t want to give up any time soon. We don’t have to worry about the speed of internal network requests, complex deployments, or optimizing work that touches multiple services. This blog post isn’t about the relative benefits of monoliths though (but we’ve written more about that here if you are interested)! Ownership in monoliths is tricky.