
The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Reliability isn't a metric, it's a mindset

For Nick Mason, Sr. Solutions Architect at Gremlin, who lives with Type 1 diabetes, reliability is a way of life. Full transcript: Reliability isn't just a metric; to me, it's a mindset. As someone who works in site reliability engineering and also lives with Type 1 diabetes, I find the concept of reliability deeply personal. In tech, reliability means building systems that recover gracefully, and in life with a chronic condition like diabetes, it's the same thing.

In The AI Era, The Winning Teams Track Cloud Unit Costs From Day 1

Everyone’s obsessed with speed right now. Ship fast. Stack features. Slap an LLM on it and call it v1. Amirite? But in the AI era, where cloud costs can spiral in a weekend, moving fast isn’t enough. The teams that track cloud unit costs from Day 1? They’re the ones who come out ahead. Most teams don’t start there though. They focus on building features and chasing traction, and the cloud bill just shows up like that subscription you forgot to cancel. Maybe someone glances at it.

Stop Losing Your Git Stash With This Easy Trick!

Got 12 unnamed stashes and no idea what’s in any of them? In this episode of Wait… Git Can Do That?, we show you how to list and pop a specific stash entry using stash@{n}. You’ll learn how to: orient yourself with git stash list; pop a targeted stash with stash@{2}; and keep it around using apply instead of pop. No more mystery stashing. Just clean, precise Git workflows. Subscribe for more ways to make Git suck less.
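The stash commands mentioned above can be sketched as a small Python helper. This is an illustration only, not taken from the episode: the function names stash_ref and stash_command are hypothetical, and the helper only builds the argument lists that would be handed to git rather than running them.

```python
from typing import List


def stash_ref(n: int) -> str:
    """Return the name git uses for the n-th stash entry, e.g. stash@{2}.

    Index 0 is the most recently created stash.
    """
    if n < 0:
        raise ValueError("stash index must be non-negative")
    return f"stash@{{{n}}}"


def stash_command(action: str, n: int) -> List[str]:
    """Build the git command for one stash entry (hypothetical helper).

    "pop" applies the entry and then removes it from the stash list
    (provided it applies cleanly); "apply" applies it and keeps it.
    """
    if action not in ("pop", "apply"):
        raise ValueError("action must be 'pop' or 'apply'")
    return ["git", "stash", action, stash_ref(n)]


# Print the three commands from the blurb: list, targeted pop, targeted apply.
for cmd in (["git", "stash", "list"],
            stash_command("pop", 2),
            stash_command("apply", 2)):
    print(" ".join(cmd))
```

To run one of these for real, the list can be passed to subprocess.run(cmd, check=True) from inside a repository. Passing the arguments as a list avoids any shell quoting of the braces in stash@{2}.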

2025 Guide & Template: Automating Production Readiness

When launches are delayed or incidents occur, it’s often due to a breakdown in production readiness. Maybe documentation is outdated. Maybe no one’s on-call. Maybe a critical dependency isn’t even known. The truth is, production readiness shouldn’t be a manual checklist. Production readiness needs to be as dynamic as the software being evaluated.

Panel Discussion: Understanding the importance of GPUs for AI success

Are you curious about the role of GPUs in AI and how they can accelerate your projects? Join Kunal Kushwaha (Field CTO), Ben Norris (AI Engineer), and Kendall Miller (Strategic Business Development) in this upcoming panel discussion as they dive into the world of GPUs and their significance in AI.

The Silent API Killer: Data Coupling in Your Tests

In API testing, speed, accuracy, and confidence in test results are everything. Regardless of whether you’re validating functionality, testing performance under load, or ensuring compliance with your security posture and standards, the ultimate goal is the same: catching problems before they reach production. But what if your tests are lying to you? Lurking beneath even the most sophisticated test suites is a subtle, pervasive threat: data coupling.

What's Next for Cloud in India? Help Shape the Future with Our Cost Survey

As the cloud computing industry continues to evolve in India, it's becoming increasingly important for organizations to understand the complexities and challenges associated with it. Last year, we released a whitepaper on the cost of cloud that gathered insights from over 500 industry professionals. While this helped us understand more about the rising cloud costs, complex billing models, and vendor lock-in within the UK, it was unclear how this differed for the Indian market.

5 Ways to Accelerate Product Delivery Without Managing Infrastructure

Is slow product delivery holding you back? This article explores how traditional infrastructure management creates significant bottlenecks, from time-consuming provisioning to inconsistent environments. Discover 5 strategies to streamline your delivery without managing infrastructure, including fully managed services, on-demand ephemeral environments, GitOps, self-service deployment platforms, and intelligent container orchestration.

Introducing Live Call Routing for Incident Response

Today, we are introducing Live Call Routing, a direct phone line that connects incoming calls to on-call engineers. It captures human-reported incidents that monitoring tools might miss, closing the loop between automated alerts and real-world observations so nothing falls through the cracks. By eliminating manual call routing, it helps you respond to critical incidents faster, reducing response times from minutes to seconds.