Distributed Machine Learning With PySpark

Spark is known as a fast, general-purpose cluster-computing framework for processing big data. In this post, we’re going to cover how Spark works under the hood and what you need to know to perform distributed machine learning effectively using PySpark. The post assumes basic familiarity with Python and with machine learning concepts such as regression and gradient descent.

Key Kubernetes Concepts

Cloud computing, containerization, and container orchestration are among the most important trends in DevOps. Whether you’re a data scientist, software developer, or product manager, it’s good to know Docker and Kubernetes basics. Both technologies help you collaborate with others, deploy your projects, and increase your value to employers. In this article, we’ll cover essential Kubernetes concepts. Kubernetes comes with a lot of terminology, which can make it intimidating.

Kubernetes vs Docker: How to Choose

If you’re thinking about using containers to manage an application, there are many technologies to choose from, and it can be difficult to know where to begin. One common question is whether to use Docker or Kubernetes for managing application containers. This is a misleading question. In truth, Docker and Kubernetes aren’t competing technologies. There’s no need for them to face off.

Easy A/B Testing with PlanOut

So you want to A/B test your web app. The idea is simple, but the details can get messy, and you don’t want to re-invent the wheel. Services like Optimizely are pretty good, but they can be expensive and full of features you don’t need immediately. In this post, we’ll share how Sentry wrote an experimentation system with minimal work.

SFTT #2: Using Cognito In Serverless Integration Testing

Welcome to the second edition of Serverless from the Trenches, our series of bite-sized blog posts aimed at developers and DevOps engineers working in serverless. Each article will focus on a different technique or tool to solve a real-world problem and – hopefully – help make your work in serverless more productive. This week we look at how to add Cognito to your integration test flow, making for true black-box testing.

The New Rules of Sampling

One of the most common questions we get at Honeycomb is how to control costs while still achieving the level of observability needed to debug, troubleshoot, and understand what is happening in production. Historically, the answer from most vendors has been to aggregate your data, offering you calculated medians, means, and averages rather than the deep context you gain from having access to the actual events coming from your production environment.