
Blue Matador

Introducing Guardian DevOps

I started Blue Matador in 2016 to help people like me. The time of site reliability engineers and DevOps engineers is in short supply while the demands keep growing. We support an increasing number of applications, microservices, tools, libraries, languages, runtimes, pipelines, analytics and BI suites, and more. Even as we support more applications, the applications themselves are growing more complex to deploy and to manage.

How to Monitor Amazon RDS with CloudWatch

Amazon RDS allows you to store your application data in databases without having to actually manage the servers the databases are hosted on. It also allows you to easily set up read replicas and take snapshots of your database. However, since it’s a managed service, you have less visibility with traditional monitoring tools. As such, it becomes even more important to take advantage of the available monitoring tools in AWS.

5 Tips to Avoid Deadlocks in Amazon RDS Part 2

If you missed the first 2 tips, go back and read 5 Tips to Avoid Deadlocks in Amazon RDS (Part 1), and then come back for the last 3 tips on deadlock avoidance. Once again, I want to re-emphasize that RDS is not actually capable of creating deadlocks — it merely reports them from the underlying database engine.

5 Tips to Avoid Deadlocks in Amazon RDS Part 1

Last week, I wrote A Beginner’s Guide to Deadlocks in Amazon RDS. This week, I’d like to share what 10 years of experience have taught me about avoiding deadlocks altogether. Oftentimes, this will be out of the hands of operations people, but you can still push for dev changes based on issues in production. The more knowledgeable you are about deadlocks in general, the more developers will lean on you as a resource with wisdom, not a totalitarian barking rules.

Beginner's Guide to Deadlocks in Amazon RDS

Although AWS sometimes feels like magic, it’s just software that controls capacity and allocation on previously provisioned hardware. RDS is one of the services that can feel especially magical, because of the general difficulty and drudgery required to set up and manage a production database. In a matter of minutes, anyone can have a production database, complete with replication, automatic failover, backup schedules, and point-in-time recovery.

Upgrading Your AWS Kubernetes Cluster By Replacing It

With the recent panic over the zero-day Kubernetes vulnerability CVE-2018-1002105, Kubernetes administrators are scrambling to ensure their Kubernetes clusters are upgraded to a version that is patched for the vulnerability. As of this writing, the minimum versions that have the patch are 1.10.11, 1.11.5, 1.12.3, and 1.13.0-rc.1.
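As a rough illustration of the version check this implies, here is a minimal sketch in Python. The function name `is_patched` and the table `MIN_PATCH` are hypothetical, and two points are assumptions rather than statements from the post: release lines older than 1.10 are treated as unpatched, and pre-release tags on the 1.10 to 1.12 lines are ignored.

```python
import re

# Minimum patched patch-level per release line, taken from the versions
# quoted above (1.10.11, 1.11.5, 1.12.3). 1.13 is handled separately.
MIN_PATCH = {(1, 10): 11, (1, 11): 5, (1, 12): 3}

VERSION_RE = re.compile(r"v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.]+))?")


def is_patched(version):
    """Return True if a Kubernetes version string meets the quoted minimums."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups()[:3])
    prerelease = match.group(4)

    # Assumption: release lines older than 1.10 have no patched version.
    if (major, minor) < (1, 10):
        return False
    if (major, minor) in MIN_PATCH:
        return patch >= MIN_PATCH[(major, minor)]
    # 1.13.0 pre-releases: only rc.1 and later release candidates qualify.
    if (major, minor, patch) == (1, 13, 0) and prerelease is not None:
        rc = re.fullmatch(r"rc\.(\d+)", prerelease)
        return rc is not None and int(rc.group(1)) >= 1
    return True
```

For example, `is_patched("v1.11.4")` is false while `is_patched("v1.11.5")` is true. The version string itself would typically come from `kubectl version` output.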

How to Diagnose and Fix AWS Lambda Iterator Age

AWS Lambda can use stream-based services as invocation sources, essentially making your Lambda function a consumer of those streams. Stream sources include Kinesis Streams and DynamoDB streams. When you allow streams to invoke your Lambda function, Lambda will emit a CloudWatch metric called IteratorAge. In this post, we discuss what this metric is and how to fix it if it’s too high.
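To make the metric concrete, here is a minimal sketch of how one might read IteratorAge from CloudWatch with Python. It assumes the third-party `boto3` package and working AWS credentials. The helper names (`iterator_age_query`, `max_iterator_age_ms`, `fetch_max_iterator_age_ms`) and the 15-minute window are hypothetical choices for illustration, not something the post prescribes.

```python
from datetime import datetime, timedelta, timezone


def iterator_age_query(function_name, minutes=15, now=None):
    """Build GetMetricStatistics parameters for a function's IteratorAge."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": "IteratorAge",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "StartTime": now - timedelta(minutes=minutes),
        "EndTime": now,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Maximum"],
    }


def max_iterator_age_ms(datapoints):
    """Return the largest IteratorAge (milliseconds) among the datapoints."""
    return max((point["Maximum"] for point in datapoints), default=0.0)


def fetch_max_iterator_age_ms(function_name):
    """Query CloudWatch; requires boto3 and AWS credentials (assumed)."""
    import boto3  # imported lazily so the helpers above work without it

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**iterator_age_query(function_name))
    return max_iterator_age_ms(response["Datapoints"])
```

Keeping the query builder separate from the network call makes the parameters easy to inspect or reuse in an alarm definition.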

How to Monitor AWS Lambda with CloudWatch

Since Amazon released Lambda in late 2014, the notion of serverless applications and function-as-a-service has steadily gained steam. Being able to focus on application code and simplify infrastructure management is alluring, but traditional monitoring methods are no longer applicable. With less visibility, it becomes even more important to take advantage of the available monitoring methods. In this post, we discuss those monitoring methods: CloudWatch Metrics and CloudWatch Logs.

Using t2.unlimited to Increase Packet Limitations

I set out to find a credit mechanism or hard-coded limit in packets per second in AWS EC2. After the findings laid out in this series so far, I had one more test to perform, this time around t2.unlimited. I wanted to see how “unlimited” it is and the difference it makes in packet throughput on capable instance types. This post is about my findings.