The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Scaling Fleet and Kubernetes to a Million Clusters

We created the Fleet Project to provide centralized GitOps-style management of a large number of Kubernetes clusters. A key design goal of Fleet is to be able to manage 1 million geographically distributed clusters. When we architected Fleet, we wanted to use a standard Kubernetes controller architecture. This meant that, in order to scale, we needed to prove we could scale Kubernetes much further than we ever had.

Knowing When to Say Goodbye

By design and tradition, telecoms networks are built to last. But in a world where the rate of innovation seems to be accelerating, the end result is that a lot of legacy infrastructure needs to keep pace with, and accommodate, multiple ‘next generation’ phases. How long this can be maintained before the imperative to rip and replace becomes impossible to ignore is the multi-million-dollar question.

How to Manage AWS Cost Outliers

A few years ago, we realized that spending in our AWS product test environment had jumped significantly from one month to the next. We drilled down into the issue and traced it to some RDS database instances that had been spun up to test new product features. No one realized that these expensive instances had been left running after the tests were complete, and they went on racking up charges for several months.

New Market Research Shows More than 80% of Global 2000 Companies Planning to Leverage the Cloud Intend to Maintain On-Premises Environments

San Jose, CA, November 11, 2020 – A vast majority (84%) of companies considered “digital leaders” by IDC, a leading provider of global IT research and advice, are turning to a hybrid cloud approach as they adopt public cloud services to improve IT service delivery.

Introducing the Cloudsmith Terraform Provider

In this blog, we will go through an example of how you can use the Cloudsmith Terraform Provider to provision resources in Cloudsmith, such as repositories and entitlement tokens. HashiCorp Terraform is an awesome Continuous Configuration Automation tool. It is used to provision, update and manage infrastructure resources such as Cloud instances, containers, physical machines and more. It is a firm favourite among developers, due to its brilliant community and mix of power and simplicity.

Puppet's path to IPO and welcome to our new board members

We’ve had an exciting year here at Puppet, and although it’s not the year we could have expected, I’m encouraged and inspired every day by the resilience of our team, our commitment to each other, and our drive to help customers navigate through so much uncertainty and change.

How Netdata gets you from 0 to monitoring in minutes

Netdata is zero-configuration monitoring. It’s a principle that we’ve stood behind since the project’s beginning, when it was only our CEO Costa trying to solve a “painful, real-world problem,” and it’s one we stand by today. Our insistence on zero-configuration guides every product decision we make, every grooming process, and every React component our frontend teams design.

Welcome to Netdata's community repository: Consul, Ansible, ML

On our journey to democratize monitoring, we are proud to have open source at the core of both our products and our company values. What started as a project born out of frustration at the lack of existing alternatives (see anger-driven development) quickly became one of the most starred open-source projects on all of GitHub.

Why modern testing requires Chaos Engineering

Modern applications are changing, and traditional testing practices are no longer up to the task. Learn more about the changing landscape of QA and how Chaos Engineering provides the necessary framework for testing modern applications. Chaos and Reliability Engineering techniques are quickly gaining traction as essential disciplines for building reliable applications. Many organizations have embraced Chaos Engineering over the last few years.