D2IQ

Using Konvoy to Patch your Cluster Infrastructure (Part 1)

Recently we hit the infamous kmem bug in our internal production Konvoy cluster. We discovered the issue after users began reporting that a particular CI job was failing intermittently throughout the cluster, with errors showing up in both the pod logs and the kernel logs.

Stabilizing Marathon: Part III

So far we have covered team culture, which amplifies our code culture and design. That was fairly abstract, and you’ll be forgiven if you skipped right away to this part. I will cover our test and release pipeline, the thing that has probably had the biggest impact on Marathon’s stability. The pipeline enabled us to discover issues before our users did. I will first give an overview of the pipeline stages and then dive deep into the Loop. You will soon see what I mean by that.

Stabilizing Marathon: Part II

Part I covered our team culture, which applies to many different types of work and teams. This part covers the software engineering best practices that help us stabilize Marathon. Marathon is written in Scala and makes heavy use of Akka Actors and Streams. I probably don’t have to mention that Scala’s type system and its immutable data structures prevent a lot of bugs before we even run unit tests.

Stabilizing Marathon: Part I

This is a review of the last three years that we spent stabilizing Marathon. Marathon is the central workload scheduler in DC/OS. Most of the time when you launch an app or a service on DC/OS, it is Marathon that starts it on top of Apache Mesos. Mesos manages the compute and storage resources and Marathon orchestrates the workload. We sometimes dub it the “init.d of DC/OS”. Being such an integral part of DC/OS, we must ensure that it keeps functioning.

Double Header: Konvoy 1.5 and Kommander 1.1 Are GA!

Today we made Konvoy 1.5 and Kommander 1.1 generally available. In January, D2iQ defined a 12-month roadmap for Kommander and Konvoy. With these newest releases focused on the Single Enterprise Experience, that mission is halfway complete. Here are some of the highlights of the latest releases.

Q&A with Ziff Media Group: Why They Made the Switch to Kubernetes

Today’s leading companies are one step ahead of their competitors as they adopt new tools and disciplines emerging from the cloud native landscape. That was the case for Ziff Media Group, which is a collection of several media web properties including pcmag.com, mashable.com, deals.com, offers.com, and more.

D2IQ

D2iQ is your trusted guide to the cloud native landscape. We simplify the choices you need to make around infrastructure, technology, and support so you can drive smarter and more reliable deployments.

O'Reilly eBook: Cloud Native Containers and Next-Gen Apps

Developers often struggle when first encountering the cloud. Learning about distributed systems, becoming familiar with technologies such as containers and functions, and knowing how to put everything together can be daunting. With this practical guide, you'll get up to speed on patterns for building cloud native applications and best practices for common tasks such as messaging, eventing, and DevOps. Authors Boris Scholl, Trent Swanson, and Peter Jausovec describe the architectural building blocks for a modern cloud native application. You'll learn how to use microservices, containers, serverless computing, storage types, portability, and functions.