Latest News

AWS outage leaves millions in the dark, points to need for better API monitoring

Millions of users of popular apps such as Facebook, Ring, Alexa, and Disney+ were left scratching their heads, wondering when they would be back online, after a widespread Amazon Web Services (AWS) outage on Tuesday. The outage centered on a number of core AWS services in the US-EAST-1 Region, including increased API error rates with Amazon DynamoDB, Amazon Elastic Compute Cloud, and Amazon Connect, which handles contact center calls, according to the AWS Service Health Dashboard.

EC2 Reserved Instance: Everything You Need to Know

An Amazon Reserved Instance (RI) is one of the most powerful cost-saving tools available on AWS. It’s officially described as a billing discount applied to the use of an on-demand instance in your account. To truly understand what an RI is, we need to take a step back and look at the different payment options for AWS.

AWS Cloud Performance Anomaly Detection - A Real-life Case Study

Here’s a myth that needs to be debunked: the cloud will take care of my performance problems! Our experience shows that cloud architecture usually introduces new layers of complexity that did not exist in the on-premises world. You need a modern AI-powered full-stack monitoring solution to find the needle in the multi-layered haystack that is the cloud. Sometimes, it’s the cloud vendor who has to fix the issue.

Estimating Your Cloud Costs is EASY. Do it in Just 3 Clicks.

One of our customers recently got their first bill after moving their Linux and Windows workloads to Azure. Their bill was astronomical! They struggled to answer the question, “how much will it cost?” and their initial cost assessments were vague at best. Here’s what they did.

Share your failures, fix them faster with shareable activities

When you’re working with a Continuous Delivery workflow, you rely on building and deploying your websites in such a way that any improvement can be released into production at any time. Identifying and fixing failures quickly is key to enabling rapid development cycles. But what happens when you’re looking into a failed build step, with no clue as to how to address it? You can now share links to specific lines within the activity logs.

What we learned from AWS's us-east-1 outage

In case you missed it, for several hours on December 7, 2021, AWS's us-east-1 region had an outage impacting multiple AWS APIs, taking out various websites across the internet. According to our own monitoring at OnlineOrNot, the outage started at 2021-12-07 15:32 UTC and began to recover at 2021-12-07 22:48 UTC (with minor signs of life for a few minutes around 2021-12-07 20:08 UTC). Had we relied solely on AWS to update their status page before reacting, we would have been waiting a while.
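As a rough illustration of how independent uptime checks translate into an outage window, here is a minimal Python sketch. It is not OnlineOrNot's implementation: the function name `outage_windows` and the sample check results are hypothetical, and only the 15:32 and 22:48 UTC timestamps come from the report above. The brief signs of life around 20:08 UTC are ignored for simplicity.

```python
from datetime import datetime, timezone


def outage_windows(checks):
    """Collapse time-sorted (timestamp, ok) samples into outage windows.

    A window opens at the first failing sample and closes at the next
    passing sample. An outage that never recovers has end = None.
    """
    windows = []
    start = None
    for ts, ok in checks:
        if not ok and start is None:
            start = ts                    # first failure opens a window
        elif ok and start is not None:
            windows.append((start, ts))   # first success closes it
            start = None
    if start is not None:
        windows.append((start, None))     # still failing at the last sample
    return windows


def utc(hour, minute):
    # Helper for building timestamps on 2021-12-07 (UTC).
    return datetime(2021, 12, 7, hour, minute, tzinfo=timezone.utc)


# Hypothetical check results; only 15:32 and 22:48 are from the report.
checks = [
    (utc(15, 31), True),
    (utc(15, 32), False),
    (utc(18, 0), False),
    (utc(22, 48), True),
]

windows = outage_windows(checks)
start, end = windows[0]
duration = end - start
print(f"Outage from {start:%H:%M} to {end:%H:%M} UTC lasted {duration}")
```

With the two reported timestamps as bounds, the window works out to a little over seven hours, which is the kind of figure a status page alone would not have surfaced in time.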

What Value Does a Cloud Data Platform Hold For Your Business?

It has been roughly two decades since cloud computing first appeared on the scene, and yet, despite overwhelming evidence of the operational productivity improvements, cost savings, and competitive advantages it provides, a significant portion of the banking industry still operates without it.

Monitor the Azure Cosmos DB integrated cache with Datadog

Azure Cosmos DB is a fully managed NoSQL database that scales automatically with load and supports multiple APIs. This makes it easy to integrate with your applications while removing the need to maintain your own database servers. The Cosmos DB integrated cache, which is now in public preview, is a new offering that can help reduce costs and improve performance for Azure Cosmos DB.

Incident Review - AWS Outages Crash Major Online Services - Including Amazon

The following is an analysis of the Amazon Web Services incident on 12/07/2021. Millions of users were affected by an Amazon Web Services outage that took down major online services such as Amazon, Amazon Prime, Amazon Alexa, Venmo, Disney+, Instacart, Roku, Kindle, and multiple online gaming sites. The outage, which originated in the US-EAST-1 region on Dec. 7, 2021, is still ongoing at the time of blog publication.

What is Cloud Repatriation and How to Avoid It with Cloud Cost Management

Cloud computing is one of the great technologies of our era. As such, enterprises everywhere are in a hurry to migrate to the cloud. However, one of the less-talked-about trends of our time is cloud repatriation: the process of enterprises reversing their decision, leaving the cloud, and returning to an on-prem setting. According to TechTarget, 85% of enterprises reported plans to repatriate their workloads from the public cloud in 2019.