DevOps

The latest News and Information on DevOps, CI/CD, Automation and related technologies.

How Does Google Handle Critical Incidents?

While there are some very good sources out there on how to manage a critical incident, Google also wrote a chapter on incident management in their book, “Site Reliability Engineering”. In this chapter, the folks at Google present their approach to a well-designed critical incident management process.

Securing a Web Application with AWS Application Load Balancer

I was recently called upon to secure an Nginx web server with HTTPS, and my goal was to set this up with a certificate obtained from AWS Certificate Manager. It took me a while to figure out how to get everything configured and working. Hopefully someone else who is attempting to do the same thing will read this and I can save you some time!

Intro to NGINX

If you've been following along with my posts, you have a sound introduction to Apache Web Server, how it functions, its place in history, and how Sumo Logic can help you sort through the numerous logs it produces. Apache access and error logs are integral to understanding the traffic patterns and issues your users face when accessing your web applications. Sumo Logic helps administrators parse through logs, isolate issues, and determine the root causes of errors.

How to Decode Your AWS Bill (and What's within DevOps' Control)

The typical AWS bill, otherwise known as the AWS Cost and Usage Report, includes line items that are useful to both finance and DevOps. However, many of the metrics that are within engineers’ and cloud architects’ control aren’t so simple to discover. To make cost a first-class operational metric for DevOps, teams need visibility into the data that’s relevant to engineering activity.

The Mythical 'Average' IT Shop

In health care, doctors know the average man or average woman is, in fact, mythical. Everyone has their own unique problems, capabilities and life stories. DNA can be altered by the environment. Even identical twins will have different health histories and different propensities to contract and avoid certain illnesses. Each individual is different. The same can be said of IT: no two shops are ever exactly alike.

Manual Rotation of Certificates in Rancher Kubernetes Clusters

Kubernetes clusters use multiple certificates both to encrypt traffic to the Kubernetes components and to authenticate those requests. These certificates are auto-generated for clusters launched by Rancher and for clusters launched by the Rancher Kubernetes Engine (RKE) CLI.

Serverless Data Processing with AWS Step Functions, Part II.

Back in Part I of Deploying a Serverless Data Processing Workflow with AWS Step Functions, Nuatu mentioned that one key benefit of using step functions is the visibility they give into business-critical workflows. Outside stakeholders, support staff, and other engineers can look at a state machine execution in AWS or Stackery and easily understand the process.

Why Your Cloud Costs Might Be so High, and What to Do About It

Recent headlines surrounding big-name IPOs, such as those of Slack and Lyft, have highlighted the very real costs of operating in the cloud. Companies like these are on the hook to pay AWS and other public cloud vendors tens or hundreds of millions of dollars every year, just to run their services.