
The latest News and Information on DevOps, CI/CD, Automation and related technologies.

What is Disaster Recovery Testing? Explained in 60 seconds | Resilience Testing | Harness

What happens when things suddenly break in your system? In this short video, we explain disaster recovery testing in simple terms. Learn why it matters, how it helps you stay prepared, and how you can make sure your system gets back up quickly when something goes wrong. Watch to understand the basics in under a minute.

How Much Does It Cost To Keep Up With The AI Joneses?

I’ve been an engineering leader for over a decade, and I’ve spent most of those years in private Slack groups with other engineering leaders, comparing strategies and kvetching about Kubernetes. Of the hundreds of threads I’ve taken part in, the one that got the most engagement the fastest was a recent one around AI adoption. “Where are you on this continuum?”, it read. “A. You don’t really care how people use AI; B. You push people to use AI; or C.

Beyond the spreadsheet: Using GitOps to generate DORA-compliant audit trails

In the 2026 regulatory landscape, manual audits are a liability. This guide explores using GitOps to generate DORA-compliant audit trails through IaC, drift detection, and automated segregation of duties. Discover how the Qovery management layer turns compliance into an architectural output, reducing manual overhead for CTOs and Senior Engineers.

Lowering PUE: Building Envelope Efficiency in Edge Computing Units

Edge computing is changing how we handle data across the globe. Smaller units closer to the user need smart cooling to stay efficient. Compact systems handle big tasks in small spaces without needing giant server rooms. Power Usage Effectiveness (PUE) is the ratio of total facility energy to the energy used by IT equipment, so it shows how much power goes to support systems such as cooling rather than to computing. Improving the outer shell of units helps keep costs low. High efficiency is a goal for every tech site, and it saves money.
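As a rough illustration of the PUE metric mentioned in this excerpt, the sketch below computes it using the standard definition: total facility energy divided by IT equipment energy. The function name and the energy figures are hypothetical and are not taken from the article. They only show how lower cooling overhead would move the number toward 1.0.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return Power Usage Effectiveness: total facility energy / IT energy.

    A value of 1.0 would mean every kilowatt-hour goes to IT equipment;
    anything above that is overhead (cooling, power conversion, lighting).
    """
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


if __name__ == "__main__":
    # Hypothetical edge unit: same IT load, less cooling energy after
    # the enclosure is improved. These figures are illustrative only.
    before = pue(total_facility_kwh=150.0, it_equipment_kwh=100.0)
    after = pue(total_facility_kwh=120.0, it_equipment_kwh=100.0)
    print(f"PUE before: {before:.2f}, after: {after:.2f}")
```

Both energy values must cover the same measurement period, otherwise the ratio is not meaningful.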

The Observability Gap: Why Monitoring Data Should Drive Tests

Most teams already know a lot about production. They have dashboards. They have traces. They have alerts. They have enough telemetry to explain what happened after an incident and enough graphs to argue about it for the rest of the week. Then they go to test a change and start from scratch. The integration tests hit a hand-written mock that returns {"status": "ok"}. The load tests replay a CSV somebody exported months ago. Staging is close enough to production right up until it matters.

#054 - From Shiny Objects to FinOps: Taming Cloud Costs in the AI Era with Josh Schlanger (CloudX...

In this episode of the Kubernetes for Humans podcast, we are joined by infrastructure and FinOps expert Josh Schlanger. Drawing on over 15 years of experience across Martech, e-commerce, and health tech, Josh shares why solving core business problems should always take priority over chasing new, "shiny object" technologies.

Women's Day Panel: Navigating the Future of Engineering in the Age of AI

How is AI reshaping engineering, and what does it mean for the future of work? At our first GTA Boston Hub event of the year, we brought together engineering leaders from Boston Consulting Group and Athenahealth to dive into one of the most pressing topics today: the rise of generative AI. Key takeaway: This isn’t “human vs AI”; it’s human augmented by AI. The real advantage lies in how we adapt, collaborate, and lead in this new era.

AWS VPC Peering Vs. Transit Gateway: Which To Choose And Why [2026]

AWS VPC peering connects two VPCs directly with no hourly fee, which makes it simple and cost-effective in smaller setups, but it creates an unmanageable mesh as your VPC count grows. For growing multi-account platforms, Transit Gateway can offer predictable structure and centralized governance.
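To make the mesh growth concrete, here is a minimal sketch that counts connections under one assumption: every VPC must be able to reach every other VPC. Because VPC peering is not transitive, a full mesh needs one peering connection per pair of VPCs, while a hub-and-spoke design through a single Transit Gateway needs one attachment per VPC. The function names are hypothetical, and the sketch ignores pricing, route-table limits, and multi-region setups.

```python
def full_mesh_peering_connections(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC.

    Peering is pairwise and non-transitive, so the count is n * (n - 1) / 2.
    """
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count: int) -> int:
    """Attachments needed when every VPC connects to one Transit Gateway."""
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    return vpc_count


if __name__ == "__main__":
    for n in (2, 5, 10, 50):
        print(
            f"{n:>3} VPCs: "
            f"{full_mesh_peering_connections(n):>5} peering connections vs "
            f"{transit_gateway_attachments(n):>3} Transit Gateway attachments"
        )
```

The quadratic growth on the peering side is what the excerpt calls an unmanageable mesh. Which option is cheaper at a given size depends on traffic volume and pricing, and the sketch does not model either.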