Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Orbital Materials: World-Class AI Models Built on CivoStack

Daniel Miodovnik, COO of Orbital Materials, explains how the CivoStack enables world-class AI models that outperform those of the big-tech giants. He outlines the power draw and cooling demands of megawatt-scale GPU racks, the water and CO₂ intensity of today’s data centres, and why a sovereign, Civo-based solution is the key to speed and predictable costs.

What Is AWS Step Functions? A Complete Guide

Imagine you are building an e-commerce app. Every time a customer places an order, a lot happens behind the scenes: you need to charge their card, update inventory, create a shipping label, and send a confirmation email. You could try to write one giant program that does everything in the correct order, but that quickly becomes a tangled mess, especially if something fails halfway through (say, payment succeeds but the inventory update fails).
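The failure case above can be sketched in plain Python. This is not the Step Functions API or Amazon States Language, only an illustration of the orchestration problem under stated assumptions: each step is paired with a hypothetical "undo" action, and when a step fails, the steps already completed are rolled back in reverse order. All function and step names here are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

# A step is (name, action, undo). All names are hypothetical.
Step = Tuple[str, Callable[[Dict], None], Callable[[Dict], None]]


def run_order_workflow(order: Dict, steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse."""
    log: List[str] = []
    done: List[Step] = []
    for step in steps:
        name, action, _undo = step
        try:
            action(order)
        except Exception:
            log.append(f"failed:{name}")
            # Roll back everything that already succeeded, newest first.
            for prev_name, _action, undo in reversed(done):
                undo(order)
                log.append(f"undone:{prev_name}")
            break
        log.append(f"ok:{name}")
        done.append(step)
    return log


# Hypothetical step implementations for the e-commerce example.
def charge_card(order: Dict) -> None:
    order["charged"] = True


def refund_card(order: Dict) -> None:
    order["charged"] = False


def update_inventory(order: Dict) -> None:
    # Simulates the "payment succeeds but inventory update fails" case.
    raise RuntimeError("inventory service unavailable")


def noop(order: Dict) -> None:
    pass


ORDER_STEPS: List[Step] = [
    ("charge_card", charge_card, refund_card),
    ("update_inventory", update_inventory, noop),
    ("create_shipping_label", noop, noop),
    ("send_confirmation_email", noop, noop),
]
```

Calling `run_order_workflow({"id": 1}, ORDER_STEPS)` charges the card, hits the inventory failure, and then refunds the card. Even this small sketch has to track completed steps and rollback order by hand, which is the bookkeeping a workflow service is meant to take over.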

Announcing CloudZero's Oracle Cloud Connector: Real Cost Intelligence For AI And High-Performance Workloads

For years, enterprises have turned to Oracle Cloud Infrastructure (OCI) for what it does best: powering mission-critical applications with unmatched performance, security, and predictable economics. OCI has historically staked its reputation on being the go-to platform for organizations running complex, data-intensive workloads, from core databases and ERP systems to large-scale compute clusters.

Developer Onboarding

Welcome to the future of developer onboarding with Cortex. In this demo, you’ll see how Cortex helps new engineers ramp up faster by giving them instant access to everything they need (context, ownership, best practices, and workflows) in one place. With Cortex, onboarding becomes a structured, data-driven, and empowering experience. Developers can explore your ecosystem confidently, follow golden paths, and start delivering value immediately, reducing ramp-up time from months to days.

MTBF, MTTR, MTTF, MTTA: Incident Metrics Explained

Incidents are inevitable. What matters is how you manage them: how you detect, respond to, and resolve them. A robust incident management process relies on data, not guesswork. Incident management metrics like MTBF, MTTR, MTTF, and MTTA provide measurable insight into reliability, response time, and recovery performance. When used together, they help identify weaknesses, reduce downtime, and build more resilient systems.
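As a rough illustration of how these metrics are derived from incident data, here is a minimal Python sketch. It assumes common definitions, which vary between teams: MTTA is the mean time from incident start to acknowledgement, MTTR is the mean time from start to resolution, and MTBF is the total uptime in an observation window divided by the number of failures. MTTF, which usually applies to components that are replaced rather than repaired, is not computed here. The `Incident` record and its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Incident:
    # Hypothetical record; field names are assumptions for this sketch.
    started: datetime
    acknowledged: datetime
    resolved: datetime


def _mean(deltas: List[timedelta]) -> timedelta:
    return sum(deltas, timedelta()) / len(deltas)


def mtta(incidents: List[Incident]) -> timedelta:
    """Mean time to acknowledge: start -> acknowledgement."""
    return _mean([i.acknowledged - i.started for i in incidents])


def mttr(incidents: List[Incident]) -> timedelta:
    """Mean time to resolve: start -> resolution."""
    return _mean([i.resolved - i.started for i in incidents])


def mtbf(incidents: List[Incident], window_start: datetime,
         window_end: datetime) -> timedelta:
    """Mean time between failures: uptime in the window / failure count."""
    downtime = sum((i.resolved - i.started for i in incidents), timedelta())
    uptime = (window_end - window_start) - downtime
    return uptime / len(incidents)


# Two example incidents inside a 24-hour observation window.
INCIDENTS = [
    Incident(datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 2, 10),
             datetime(2024, 1, 1, 3, 0)),
    Incident(datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 12, 20),
             datetime(2024, 1, 1, 13, 0)),
]
WINDOW_START = datetime(2024, 1, 1, 0, 0)
WINDOW_END = datetime(2024, 1, 2, 0, 0)
```

With this sample data, MTTA is 15 minutes, MTTR is 1 hour, and MTBF is 11 hours (22 hours of uptime across two failures).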

Harness patent for hybrid YAML editor enhances CI/CD workflows

Harness earned a patent for its unified pipeline editor, which makes it easy to configure pipelines, whether they are for CI, CD, IaC, database migrations, service onboarding, or other DevSecOps activities. We're thrilled to share some exciting news: Harness has been granted U.S. Patent US20230393818B2 (originally published as US20230393818A1) for our configuration file editor with an intelligent code-based interface and a visual interface.

SRE vs DevOps vs Platform Engineering: What Are the Key Differences

Software delivery is more complex than ever. Teams need speed, reliability, and scalability to stay competitive. Site Reliability Engineering (SRE), DevOps, and Platform Engineering are three key disciplines that address these challenges. Though these terms are often used together, they are not the same. In this blog, we’ll discuss each term individually, compare SRE vs. DevOps vs. Platform Engineering, and show how they work together.

Observability vs. Monitoring: What's the Difference?

Modern systems are complex, distributed, and fast-changing, so keeping them reliable requires more than watching dashboards. Understanding the difference between observability and monitoring shows how teams gain the deep insight needed to detect, diagnose, and resolve issues. Monitoring collects predefined metrics and alerts you to known problems, while observability provides rich, contextual telemetry to investigate unknown failures.

When Breaches Expose Your Secrets: Why Automation is the Key to Fast, Scalable Remediation

In early October, Red Hat disclosed a breach of a GitLab system used by its Consulting division. Threat actors claim to have exfiltrated hundreds of gigabytes of project data — and while investigations are still underway, reports suggest consulting engagement artifacts may have been impacted. For the organizations involved, the concern isn’t limited to reputational damage.