The latest news and information on cloud monitoring, security, and related technologies.

The Best Cloud Storage Deals of Black Friday 2025

Looking for the best cloud storage deals? You’re in the right place, and with Black Friday just around the corner, now is the perfect time. This time of year, companies offer their biggest deals on everything from tech gadgets and beauty products to video games and much more. For cloud storage, we’ve got you covered with the best deals of the year, so you can store, back up, sync, and share your files with friends, family, or colleagues.

Introducing Updog.ai: Real-time provider status from Datadog

When external SaaS providers or cloud services degrade or go down, engineers often find themselves wondering if the issue they're encountering is local or more widespread. The answers they find are usually slow to surface, limited in detail, or entirely dependent on the provider's updates. Vendor-controlled status pages and third-party aggregators don’t provide the timely, independent visibility that's necessary to quickly and accurately identify the root cause of slowdowns.

When AWS Goes Down: What It Means For Your Cloud Costs

A global outage at Amazon Web Services (AWS) did more than knock popular apps offline. It laid bare the cost risks embedded in many cloud architectures. As services fail, the hidden costs of high availability, from redundancy planning to recovery operations, often multiply. For cloud cost leaders, this isn’t an issue of uptime; it’s a visibility and budget-shock issue. It’s a key reminder that architecting for resilience involves difficult trade-offs.

PagerDuty Joins AWS QuickSuite: Connect Your Incident Management with 1,000+ Applications

Today, we’re announcing that PagerDuty is now available in AWS QuickSuite through the Model Context Protocol (MCP). This means PagerDuty’s incident management capabilities can now connect with the 1,000+ applications and data sources that QuickSuite integrates with, from AWS services to enterprise SaaS platforms, all accessible through natural language.

AWS Outage: How do you prepare for the failure of your own safety net?

When AWS’s massive outage struck, it didn’t just take down cloud services, apps, and enterprise platforms. It also knocked out many of the monitoring systems organizations depend on for real-time answers. Observability companies, including Datadog, New Relic, Checkly, Dynatrace, SpeedCurve, and Splunk Observability, lost visibility or functionality precisely when organizations needed them most.

Data Sovereignty in the Age of AI: A Conversation with Kelsey Hightower and Mark Boost

Join Kelsey Hightower and Mark Boost at Civo Navigate London as they discuss sovereignty in the context of AI and cloud computing. The conversation highlights the need for a more nuanced approach to cloud computing, one that balances the benefits of public cloud with the need for control and sovereignty. The discussion emphasizes the importance of open protocols and the role of the community in driving innovation, and notes that the adoption of AI workloads is driving a shift towards more decentralized and sovereign cloud architectures.

Kubernetes Security Guide: Risks, Strategies, And Tools

In 2018, attackers gained access to Tesla’s AWS cloud environment through an unprotected Kubernetes admin console. Because the console lacked proper authentication, the attackers could view and control cluster resources. Once inside, they deployed new pods running cryptocurrency mining software, using Tesla’s compute power for profit. During the breach, the attackers also uncovered credentials stored in the cluster.

25 Sumo Logic updates to better monitor and secure your Azure environments

If you manage workloads across multiple clouds, you know how easy it is for critical alerts or performance issues to get lost in the noise. Switching between consoles, correlating logs, and tracking metrics across platforms can slow down troubleshooting, delay incident resolution, and increase the risk of missed alerts.