Ghosts of Servers Past: The Bare-Metal Comeback Story

Bare-metal. Just reading that word might trigger a physical reaction for some of us. Dusty closets, old server rooms, and loud rigs that never seemed to work quite right. Remember waiting days for IT to provision a server, only to realize your ticket got lost in the shuffle? Or the classic "well, it worked on my machine" excuse right before a production push? Ah, the good old days.

Colsubsidio transforms business process monitoring with Elastic Observability

Colsubsidio is one of the largest and most representative family compensation funds in Colombia. The organization manages and delivers essential social services to millions of users through a broad network spanning health, education, subsidies, recreation, tourism, credit, housing, pharmacies, retail supply, culture, and labor welfare.

Keeping it boring: the incident.io technology stack

At incident.io we run a deliberately simple technology stack. Keeping things boring has allowed us to scale from a few hundred customers to several thousand with only two platform engineers. In this post I'll walk through the stack, explain some of the choices we've made, and touch on the challenges we're facing as we grow.

The Command Center Shift: Why the Future of Middleware is Unified, Predictive, and Transaction-Centric

Middleware is evolving beyond invisible plumbing into a strategic Command Center. The future demands unified management, predictive intelligence, and transaction-centric operations to move from reactive firefighting to operational mastery in 2026.

Enable end-to-end visibility into your Java apps with a single command

Achieving end-to-end observability for applications is a top priority for organizations today, but instrumenting for both frontend and backend monitoring can be a significant hurdle. What complicates matters is that the SREs and DevOps teams responsible for deploying monitoring tools typically don’t own frontend code or have the context needed to safely modify it.

Powering Security Innovation: Executive Q&A on Splunk Joining AWS Security Hub Extended

To succeed in the AI era, customers need fast, easy access to security solutions that can harness the power of agentic AI and deliver business outcomes. They need seamless access to their data for faster threat detection, simpler incident response, and reduced risk. They need technology vendors to work together and not in silos.

Inside the architecture: How Upsun delivers 99.99% uptime for AI

For a CTO, "four nines" represents a commitment to keeping production revenue live with no more than 0.01% downtime per year. As AI workloads move from pilot projects into core production services, the reliability requirements for infrastructure have shifted. AI agents, RAG pipelines, and automated LLM workflows depend on a consistent platform state.
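
To put the "four nines" figure in concrete terms, here is a minimal sketch of the arithmetic. The function name and the 365-day year are illustrative assumptions, not anything taken from Upsun's platform or SLA; on those assumptions, 99.99% availability leaves a budget of roughly 52.6 minutes of downtime per year.

```python
# Illustrative only: converts an availability percentage into a yearly
# downtime budget. Assumes a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the maximum downtime per year, in minutes, for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% availability allows {minutes:.1f} minutes of downtime per year")
```

Each additional nine cuts the budget by a factor of ten, which is why moving from three nines to four tends to change how a platform is designed rather than just how it is operated.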

Build a Unified Operational Ecosystem with ServiceNow and Coralogix

During high-priority incidents, SRE teams frequently lose critical time switching between monitoring platforms and ticketing systems. This context switching forces engineers to update incident states manually by copying and pasting data. The inevitable result is a greater risk of information gaps and a longer Mean Time to Recovery (MTTR).

Unmasking the Resolute Raccoon

You’ve almost certainly seen them… In the forest, rummaging through a dumpster, in poorly aging millennial memes. Raccoons are ubiquitous and endlessly entertaining creatures. YouTube and TikTok are full of videos documenting their clever antics and escapades. One such intrepid raccoon gained fame for making their way to the most unlikely places, from liquor stores to karate studios.