Major Cloud Outages of 2025

Cloud outages in 2025 ranged from minor incidents that affected a subset of users to major ones that affected hundreds or thousands. Services such as Cloudflare and AWS, on which many other services depend, experienced outages whose effects cascaded to many downstream users. Let's look at some of the major cloud outages in 2025.

How to use AI to analyze and visualize CAN data with Grafana Assistant

Note: A version of this post originally appeared on the CSS Electronics blog. Martin Falch, co-owner and head of sales and marketing at CSS Electronics, is an expert on CAN bus data. Martin works closely with end users, typically OEM engineers, across diverse industries, including automotive, maritime, and industrial. He is passionate about data visualization and AI—and he’s been working extensively with Grafana Assistant.

How to use Gremlin's Reliability Report

Modern applications can easily include hundreds of discrete services, all of which need to be reliable in order for the application to function correctly. While running tests on a handful of critical services can lead to small reliability improvements, real impact requires testing and increased reliability visibility across your entire organization. That’s the logic behind the new, improved Reliability Reports within Gremlin.

AI Reliability, Part 2: When the Datacenter Becomes the Bottleneck

In Part 1, we talked about all the hidden complexity inside AI systems: the pipelines, GPUs, embeddings, vector databases, orchestration layers, and everything else that quietly determines how reliable an AI-first product really is. But all of that software still rests on something far less glamorous: the physical infrastructure underneath it.

Elastic and Microsoft partnership achievements in 2025

Highlights of another successful year of customer-centric collaboration. Once again, our partnership delivered an impressive year of innovation with Microsoft Azure, Azure AI Foundry, and Azure OpenAI. This blog highlights our continued collaboration with Microsoft to better serve customers throughout 2025 and our key moments at Microsoft Ignite.

How Aerospace Companies Use InfluxDB

Over the past two decades, we’ve witnessed the instrumentation of virtually everything in the aerospace industry, from manufacturing floors to satellites orbiting Earth. And it’s no longer just NASA and other government organizations leading the charge. The commercial space industry has grown exponentially, with private companies developing everything from GPS satellites to electric VTOL aircraft.

AWS re:Invent 2025: 6 FinOps Signals That Mattered

This year’s AWS re:Invent was a blur of GPUs, LLMs, and infrastructure roadmap reveals — but for those listening between the keynotes, another story was unfolding. Between hallway chats, booth conversations, and live polls, a signal emerged from the noise: FinOps is growing up. Mature cloud teams aren’t just managing costs — they’re asking smarter, more strategic questions about value, forecasting, and engineering accountability.

13 Real-World FinOps Insights From Anderson Oliveira

On a recent episode of FinOps In Full Bloom, host Thalia Elie sat down with Anderson Oliveira, a Senior FinOps Account Manager at CloudZero. With more than two decades in IT and deep FinOps expertise, Anderson brought clarity, humor, and a refreshingly human perspective to the conversation. Their chat covered everything from visibility and budgets to cultural friction and how to shift teams from resistance to results. Here are 13 insights and takeaways every FinOps-minded leader should hear.

SQL Compare & SQL Data Compare v16: Introducing SQL Server 2025 Support, Enhanced Security & More

SQL Compare and SQL Data Compare v16 introduce SQL Server 2025 support and improved credential security. Plus, SSMS 22 integration is coming soon. We have just released a new major version of SQL Compare and SQL Data Compare – version 16. This major version has two big items and one coming soon.

Let's Encrypt 45-Day Certificate Expiration: Monitoring & More

The move by Let’s Encrypt from 90-day certificates to 45-day certificates is more than a policy shift. It changes how teams must manage renewals, detect failures, and validate that certificates are deployed consistently across distributed systems. A shorter lifecycle compresses the margin of error. Automation that previously limped along unnoticed now breaks on a far tighter schedule. And every misconfiguration hits users faster.
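
As a rough illustration of the monitoring idea, the sketch below computes how many days remain on a certificate and flags it when the remaining time drops under a threshold. It uses only the Python standard library. The function names and the 15-day threshold are assumptions made for this example, not recommendations from Let's Encrypt or from the original post.

```python
import socket
import ssl
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the days left before a certificate's notAfter time.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    for example "Jan 11 00:00:00 2026 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / SECONDS_PER_DAY


def needs_attention(not_after: str, now: datetime, threshold_days: float = 15) -> bool:
    """Flag a certificate whose remaining lifetime is under the threshold.

    The 15-day default is an illustrative assumption; pick a value that
    matches your own renewal schedule.
    """
    return days_until_expiry(not_after, now) < threshold_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field from the certificate a live host presents."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Running `needs_attention(fetch_not_after(host), datetime.now(timezone.utc))` against each endpoint that serves the certificate, rather than against a single host, is one way to check that a renewed certificate was actually deployed everywhere.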