The latest News and Information on DevOps, CI/CD, Automation and related technologies.

The 7 Most Common Incident Mistakes (and How to Prevent Them)

The hidden blockers slowing down your incident response and how to remove them before they become reliability risks. Incidents rarely go wrong because of one big failure. Most of the time, it’s a handful of small, familiar mistakes that slow teams down, muddy communication, or create confusion in the heat of the moment. Fortunately, these mistakes are predictable and fixable.

Packaging Operations Runbooks with Puppet Edge Workflows

Puppet Edge Workflows, available with Puppet Enterprise Advanced, provide the orchestration tools to define multistep workflows that run against your infrastructure. This allows Puppet experts to create workflows that Ops teams can run without deep knowledge of the Puppet language or the underlying infrastructure.

Searching Certificate Transparency Logs (Part 2)

In the last post we discussed why we’re building our own Certificate Transparency (CT) search tool. There’s good background on the CT ecosystem in that post, so check it out if you haven’t. This post assumes a certain understanding of terminology covered previously. Now that we know where the CT logs live, and the different kinds of logs, we need to start reading them.

Vendor lock-in: not even once

Vendor lock-in remains one of the most significant concerns when choosing a cloud platform. When your data becomes trapped in proprietary formats or services, migration costs skyrocket and your flexibility disappears. This challenge affects organizations of all sizes, from startups planning for growth to enterprises managing complex compliance requirements.

5 Kubernetes Cost Management Insights From CloudZero's Latest Webinar

Kubernetes has reshaped how teams build and scale infrastructure, but it’s also made cost visibility a lot harder. For platform engineers, SREs, and FinOps leads, breaking down shared cluster costs, understanding per-team usage, and driving efficient resource allocation are still major challenges. That’s why one CloudZero webinar with Umesh Rao (Director, Tech Enablement) and John Hashem (Senior Sales Engineer) stood out.

What is Jira Service Management (JSM)? Key Features & Benefits Explained

Atlassian is shutting down OpsGenie. New sales stopped on June 4, 2025. Complete shutdown happens on April 5, 2027. Atlassian wants you to migrate to Jira Service Management (JSM). But like many OpsGenie users, you probably have questions. What is JSM? How does it handle alerting, escalation policies, and on-call schedules? What automation options does it have? Is it the right fit? And more. This blog breaks down everything you need to know.

How to Reduce Log Data Costs Without Losing Important Signals

You can cut your log costs by removing repetitive, low-value logs early and keeping only the parts that genuinely help you understand issues. Modern systems generate logs far faster than you expect. Even when your workload stays stable, infrastructure components, retries, and background workers continue producing a steady stream of repeated entries.
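
As a rough illustration of dropping repetitive entries early, here is a minimal Python sketch. It is not taken from the article: the function names, the placeholder patterns, and the choice of `ERROR` and `WARN` as always-kept levels are all assumptions made for the example. The idea is to mask the variable parts of each line (timestamps, IDs, counters), keep the first occurrence of each resulting template, and count the rest instead of storing them.

```python
import re

# Assumed patterns for the variable parts of a log line. Real logs will
# need patterns tuned to their own format; these are illustrative only.
_VARIABLE_PARTS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?"), "<ts>"),
    (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b"), "<uuid>"),
    (re.compile(r"\b\d+\b"), "<n>"),
]


def normalize(line):
    """Replace variable fields with placeholders so repeats share one template."""
    for pattern, placeholder in _VARIABLE_PARTS:
        line = pattern.sub(placeholder, line)
    return line.strip()


def collapse_repeats(lines, always_keep=("ERROR", "WARN")):
    """Return (line, count) pairs in first-seen order.

    Lines containing any marker in `always_keep` (simple substring match)
    are passed through untouched with a count of 1. Every other line is
    grouped by its normalized template: only the first occurrence is kept,
    and later repeats just increment its count.
    """
    kept = []    # list of [line, count], in first-seen order
    index = {}   # template -> position in `kept`
    for line in lines:
        if any(marker in line for marker in always_keep):
            kept.append([line, 1])
            continue
        template = normalize(line)
        if template in index:
            kept[index[template]][1] += 1
        else:
            index[template] = len(kept)
            kept.append([line, 1])
    return [(line, count) for line, count in kept]


if __name__ == "__main__":
    sample = [
        "2025-01-01T00:00:01Z INFO retry attempt 1 for job 42",
        "2025-01-01T00:00:02Z INFO retry attempt 2 for job 42",
        "2025-01-01T00:00:03Z ERROR payment failed for order 7",
        "2025-01-01T00:00:04Z INFO retry attempt 3 for job 42",
    ]
    for line, count in collapse_repeats(sample):
        print(f"x{count}  {line}")
```

In this sketch the three retry lines collapse into one entry with a count of 3, while the error line is kept as-is. Whether counting repeats is acceptable, and which levels must always be preserved, depends on what your team actually uses the logs for.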