
The latest News and Information on Cloud monitoring, security and related technologies.

Monitor the Azure Cosmos DB integrated cache with Datadog

Azure Cosmos DB is a fully managed NoSQL database that scales automatically with load and supports multiple APIs. This makes it easy to integrate with your applications while removing the need to maintain your own database servers. The Cosmos DB integrated cache, which is now in public preview, is a new offering that can help reduce costs and improve performance for Azure Cosmos DB.

What Value Does a Cloud Data Platform Hold For Your Business?

It has been roughly two decades since cloud computing first appeared on the scene. Yet despite overwhelming evidence of the operational productivity improvements, cost savings, and competitive advantages it provides, a significant portion of the banking industry has yet to adopt it.

AWS Outage on Dec. 7, 2021 - When Did You Know About It?

If something isn’t working as expected, your customers will want to know. How quickly did you know that AWS’s us-east-1 region was having issues? Was it from an article online? Customer requests flooding into your support queue? A tweet? Not being able to get into a PUBG match? Or, speaking of matches, were you unable to message your last Tinder connection?

What is Cloud Repatriation and How to Avoid It with Cloud Cost Management

Cloud computing is one of the great technologies of our era, and enterprises everywhere are in a hurry to migrate to the cloud. However, one of the less-talked-about trends of our time is cloud repatriation: the process of enterprises reversing their decision, leaving the cloud, and returning to an on-prem setting. According to TechTarget, 85% of enterprises reported plans to repatriate their workloads from the public cloud in 2019.

Incident Review - AWS Outages Crash Major Online Services - Including Amazon

The following is an analysis of the Amazon Web Services incident on 12/07/2021. Millions of users were affected by an Amazon Web Services outage that took down major online services such as Amazon, Amazon Prime, Amazon Alexa, Venmo, Disney+, Instacart, Roku, Kindle, and multiple online gaming sites. The outage, which originated in the US-EAST-1 region on Dec. 7, 2021, is still ongoing at the time of blog publication.

Sponsored Post

Service Mocks: Scaling a SaaS Demo with Traffic Replay

Building, running and scaling SaaS demo systems that run around the clock is a big engineering challenge. Through the power of traffic replay, we scaled our demos in a huge way. A few weeks ago we launched a new demo sandbox. This is actually a second-generation version of our existing demo system that I built a few months ago (codename: decoy). Because the traffic viewer page shows the most recent data by default, you need to constantly be pumping new data in there. Any type of real-time SaaS system is going to have a similar requirement, so it needs to be planned for.
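One way to picture the "constantly pumping new data" requirement is to loop a short captured recording and restamp each request with a fresh timestamp. The Python below is only a minimal sketch under that assumption, not the demo system described in the post: the `RecordedRequest` type, the `due_requests` function, and all parameter names are hypothetical, and it assumes every recorded offset is smaller than the loop length.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RecordedRequest:
    """One captured request (hypothetical shape, for illustration only)."""
    offset_s: float  # seconds from the start of the captured recording
    method: str
    path: str


def due_requests(
    recording: List[RecordedRequest],
    loop_length_s: float,
    epoch_s: float,
    window_start_s: float,
    window_end_s: float,
) -> List[Tuple[float, RecordedRequest]]:
    """Return (fresh_timestamp, request) pairs due in [window_start_s, window_end_s).

    The recording is treated as repeating every loop_length_s seconds,
    starting at epoch_s, so a short capture can feed a demo indefinitely.
    """
    if loop_length_s <= 0:
        raise ValueError("loop_length_s must be positive")

    # Work out which repetitions of the recording overlap the window.
    first_loop = max(0, math.floor((window_start_s - epoch_s) / loop_length_s))
    last_loop = math.floor((window_end_s - epoch_s) / loop_length_s)

    due = []
    for loop in range(first_loop, last_loop + 1):
        base = epoch_s + loop * loop_length_s
        for req in recording:
            fresh_ts = base + req.offset_s  # restamp so the data looks recent
            if window_start_s <= fresh_ts < window_end_s:
                due.append((fresh_ts, req))

    due.sort(key=lambda pair: pair[0])
    return due
```

A scheduler could call `due_requests` once per tick with the window since the previous tick and send whatever comes back, so the viewer always has recent data to show.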

Using Codefresh with GKE Autopilot for native Kubernetes pipelines and GitOps deployment

Several companies nowadays offer a cloud-native solution that manages Kubernetes applications and services. While these solutions seem easy at first glance, in reality, they still require manual maintenance. As an example, an important decision for any Kubernetes cluster is the number of nodes and the autoscaling rules you define.