
Latest Posts

How we upgraded to MySQL 8 in Grafana Cloud

Starting around June this year, we upgraded our Grafana databases in Grafana Cloud from MySQL 5.7 to MySQL 8, due to MySQL 5.7 reaching end-of-life in October. This project involved tens of thousands of customer databases across dozens of MySQL database servers, multiple cloud providers, and many Kubernetes clusters.

How we manage incidents at Datadog

Incidents put systems and organizations to the test. They pose particular challenges at scale: in complex distributed environments overseen by many different teams, managing incidents requires extensive structure and planning. But incidents, by definition, break structures and foil plans. As a result, they demand carefully orchestrated yet highly flexible forms of response. This post will provide a look into how we manage incidents at Datadog. We’ll cover our entire process.

The Journey Into Automation: Optimizing Care Delivery

In a world where efficiency and precision are the cornerstones of progress, automation has become the unsung hero across diverse industries. From manufacturing floors to customer service, its transformative power has reshaped the way we work and deliver services. Today, we embark on a journey to explore the profound influence of automation on healthcare, where each automated process is a step towards optimizing care delivery and reshaping the future of patient-centered care.

A Simple Guide To AWS Lambda Rightsizing

AWS Lambda can be an easy-to-implement solution for those looking for serverless application deployments and operations. But how can you be sure you are getting the most for your money when using this service? This guide explains some of the tools and resources at your disposal, whether you’re using the AWS console, optimizing your code and design, or combining the two.

MDM vs. MDM: What's the Difference Between Mobile Device Management and Modern Device Management?

Mobile device management and modern device management may sound similar, but there’s a significant difference between them. The explosive growth in the number of devices within enterprises makes it crucial for organizations to choose the right platform for overseeing them.

ELT: Extract Load Transform, Explained

Businesses today rely on analytics and insights derived from different data types to gain a competitive advantage. This data often comes from different sources and in different formats. Without a unified solution, aggregating it and performing analytics tasks is challenging. ELT was developed to handle the complexity of processing data from multiple sources while retaining the raw data as it is.

Customer Data Analytics: An Introduction

Simply put, customer analytics (or customer data analytics) is the process of using information about customer preferences and behavior to improve sales, marketing and product development. One example of the behavior it examines is buyers doing internet research before making a purchase. There is now a vast amount of information available online for nearly every product category.

What's the difference between API Latency and API Response Time?

Your app’s networking directly affects the user experience of your app. Imagine having to wait a few seconds for the page to load. Or even worse, imagine waiting for a few seconds every time you perform an action. It would be infuriating! Before you go on a fixing adventure, it’s a good idea to understand what causes that waiting time. So let’s do that!

Kubernetes Clusters: Everything You Need To Know

Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. It lets you create and manage groups of machines, called Kubernetes clusters, that provide a distributed platform for running containerized workloads in a scalable and highly available manner.