The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Redgate Flyway's Product Updates - March 2026

This is a guest post from Maxime Drobot. This month we’re bringing you official GitHub Actions for Redgate Flyway, usability improvements in Flyway Desktop, and a look at what’s new and what’s in preview. Plus: earlier visibility of code‑review results, helping teams keep quality high and reviews flowing smoothly as AI increases the volume of changes.

How Harness AI Helps Scale Platform-Wide Support | Harness Blog

Key Takeaway: Harness AI helped deflect 95% of the platform support tickets for a major financial institution.

These days, success is often measured by what doesn’t happen: When things go right, the software delivery platform is invisible. But what happens when an organization’s delivery velocity increases multifold? Can the platform still stay out of the way?

How to Plan a Successful CI/CD Migration Without Disrupting Developers | Harness Blog

Modern engineering teams run on CI/CD. It’s where pull requests get validated, artifacts get produced, and releases get promoted to production. That also makes CI/CD migration very risky because you're not just moving a "tool"; you're moving the workflow that developers use dozens or hundreds of times a day. The good news: disruption is optional.

A new Host Map for modern infrastructure

A host map is a visual representation of your infrastructure that displays hosts and related resources such as clusters, pods, and containers in a single, interactive view. We introduced the Datadog Host Map more than a decade ago to help you “know thy infrastructure” and answer critical questions: Does everything look healthy? Has anything changed? Does the shape of my environment match what I expect?

How to Automate Your Entire Cloud Deployment Lifecycle with IaC

In today's digital world, businesses depend on cloud infrastructure to run applications, manage data, and deliver services smoothly. However, managing cloud environments manually can quickly become complex and time-consuming. Teams often deal with repeated tasks, inconsistent setups, and unexpected errors.

Production Data Access for Developers: RBAC and DLP

If you run a software engineering tools team, you have almost certainly had this conversation: a developer asks for production data access to debug a real incident, and someone in the room says no. Not because the request is unreasonable (it isn’t), but because nobody wants to be the person who said yes when something goes wrong. That instinct is understandable. Production environments carry real risk. But the reflex to lock everything down has a cost that rarely gets accounted for.

Flaky Tests: The Quiet Killer of Productivity in Your CI Pipeline | Harness Blog

Flaky tests are automated tests that pass or fail inconsistently without changes to the code. In this guide, you’ll learn why flaky tests happen, how to detect them automatically in CI pipelines, and how modern platforms prevent them from slowing teams down. Your test went well three times yesterday. It didn't work this morning. You ran it again without changing anything, and now it works. Congratulations, you've just passed a flaky test, and now someone's day is going to be ruined.
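The definition above (mixed results with no code change) can be sketched as a small check over CI run history. This is a minimal illustration, not the method the guide or any particular platform uses: the function name `find_flaky_tests`, the `(test_name, commit_sha, passed)` record shape, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return the names of tests that both passed and failed on the same commit.

    `runs` is an iterable of (test_name, commit_sha, passed) tuples.
    This record shape is hypothetical; real CI systems expose results differently.
    """
    outcomes = defaultdict(set)
    for test_name, commit_sha, passed in runs:
        # Group outcomes by test AND commit, so results are only compared
        # across runs where the code did not change.
        outcomes[(test_name, commit_sha)].add(passed)

    # A test is flagged when a single commit produced both outcomes.
    flaky = {name for (name, _sha), seen in outcomes.items() if seen == {True, False}}
    return sorted(flaky)


# Hypothetical run history for illustration only.
history = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),   # same commit, different outcome
    ("test_login", "abc123", True),
    ("test_checkout", "abc123", True),
    ("test_checkout", "def456", False),  # different commit, so not counted as flaky
    ("test_search", "def456", False),
    ("test_search", "def456", False),    # fails consistently, so not flaky
]

print(find_flaky_tests(history))
```

Grouping by commit is the key design choice here: a test that fails after a code change may simply be catching a regression, while a test that flips on identical code matches the definition of flaky.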

Multi-Agent AI SRE Has Landed and It's Built for Your Most Complex Stacks

Once upon a time, a monolith running on a handful of servers meant that incident management, even at 2:17 AM, was something a single generalist could handle. One person with enough context across the stack could reasonably diagnose whether the database was choking, a config had changed, or a server was running hot. They’d fix it and go back to sleep.

Deployment strategies: Types, trade-offs, and how to choose

A deployment strategy is the method a team uses to move new code into a production environment. It determines how traffic shifts between versions, how much risk each release represents, and how quickly the team can roll back when something breaks. The choice isn’t academic: a mismatch between strategy and system can mean downtime, failed rollouts, or hours of manual recovery.