
DevOps

The latest news and information on DevOps, CI/CD, automation, and related technologies.

Catchpoint's SRE Report 2020 - The Highlights

Our 2020 SRE Report is ready! We launched the 2020 SRE survey this January with the goal of understanding the current state of SRE. The survey covered a range of topics. As we neared the end of the survey period, the SRE community was in the midst of a sudden change: SRE teams were forced to migrate to all-remote IT. We realized we could not provide an accurate analysis without considering how SRE teams were operating in this new environment.

Data science workflows on Kubernetes with Kubeflow pipelines: Part 1

Kubeflow Pipelines is a great way to build portable, scalable machine learning workflows. It is one part of a larger Kubeflow ecosystem that aims to reduce the complexity and time involved in training and deploying machine learning models at scale. In this blog series, we demystify Kubeflow Pipelines and showcase this method for producing reusable and reproducible data science.

Uptrends and your CI/CD processes

If you’ve been following along, the last two articles explained the CI/CD (Continuous Integration/Continuous Delivery) processes. In this article, we look at how Uptrends fits into those processes. Uptrends can fit into your CI/CD process in several different ways. For example, you may want to use Uptrends for the testing and monitoring portions, include your monitor updates as part of your automation processes, or both.

Announcing Status Checks to Ensure Safe Chaos Engineering Scenarios

One of the most important aspects of any Chaos Engineering program is knowing that every experiment is being run safely. And one of the simplest ways to ensure safe experiments is by having safeguards that prevent running chaos experiments on a system that is unhealthy or has an incident in progress. Today, Gremlin is excited to announce Status Checks, which run before you kick off a Chaos Engineering Scenario in order to verify your system is in a steady state.
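As a rough illustration of the safeguard idea described above, the following minimal Python sketch gates an experiment behind a set of pre-flight checks. It is not Gremlin's API: the names `run_scenario_safely`, `service_is_healthy`, and `no_open_incident` are hypothetical, and the checks return hard-coded values where a real setup would presumably query a monitoring or incident-management system.

```python
from typing import Callable, Iterable

# Hypothetical type alias: a status check is any callable returning True when healthy.
StatusCheck = Callable[[], bool]


def run_scenario_safely(checks: Iterable[StatusCheck],
                        scenario: Callable[[], str]) -> str:
    """Run the scenario only if every status check passes first."""
    for check in checks:
        if not check():
            # Halt before the experiment starts; the scenario is never invoked.
            return "halted: status check failed"
    return scenario()


def service_is_healthy() -> bool:
    # Assumption: stands in for a query against a monitoring tool.
    return True


def no_open_incident() -> bool:
    # Assumption: stands in for a query against an incident tracker.
    # Returns False here to simulate an incident in progress.
    return False


result = run_scenario_safely(
    [service_is_healthy, no_open_incident],
    lambda: "scenario ran",
)
print(result)
```

Because one check reports an incident in progress, the scenario callable is never invoked. Swapping in checks that all pass would let the scenario run.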

How to Expand Data Collection for InfluxDB with CloudFormation Templates

In a previous post, I demonstrated how to call InfluxDB APIs from AWS Lambda, but the setup is fairly manual and the results are not portable. Ideally, we as a community can expand and share ways to collect and process time series data. To that end, I want to share a CloudFormation template. CloudFormation is AWS’s infrastructure-as-code service that lets you define almost any AWS component in a configuration file.