Latest Posts

Lessons learned from running Kafka at Datadog

At Datadog, we operate 40+ Kafka and ZooKeeper clusters that process trillions of datapoints across multiple infrastructure platforms, data centers, and regions every day. Over the course of operating and scaling these clusters to support increasingly diverse and demanding workloads, we’ve learned a lot about Kafka—and what happens when its default behavior doesn’t align with expectations.

A Guide to the World of Cloud-Native Applications

It all started with monolithic architecture: business logic, user interfaces, and data layers were bundled into one big program. Because these applications were tightly coupled, even a simple update meant recompiling the entire application and redistributing it to all users. That made it difficult to keep program versions consistent across all clients, which stability and alignment depended on. As a result, the monolithic approach was inefficient and cumbersome.

Sysdig Secure now integrates with AWS Security Hub

Today, Sysdig is proud to announce our integration with AWS Security Hub. AWS Security Hub consolidates alerts and findings from multiple AWS services, including Amazon GuardDuty and Amazon Inspector, as well as from AWS Partner Network (APN) security solutions. Sysdig is already part of the APN. This single pane of glass gives you a comprehensive view of high-priority security alerts and compliance status across AWS accounts.

Inventory Monitoring for Your Cloud Infrastructure

Managing agile software deployment for cloud infrastructure can be challenging. Deployments should be automated whenever possible to ensure consistent version management. Even so, servers sometimes end up running different software versions. Such imperfect version management is a potential time bomb, because distributed systems and microservices often rely on exactly the same software version being installed on every cluster node.

Troubleshoot Faster with Anomaly Visualization

LogicMonitor is proud to announce anomaly visualization as an addition to our growing AIOps capabilities! With this new functionality, users can visualize anomalies that occur for a monitored resource and compare each anomaly to key historical signals, such as the past 24 hours, 7 days, or 30 days. Anomaly visualization complements LogicMonitor’s existing forecasting functionality and provides another layer of intelligence for understanding resource health.

Community Spotlight: BigQuery Plugin

The Grafana community comes up with some pretty cool stuff, and we’re hoping to spotlight some of it from time to time. Today, we’re starting with the BigQuery datasource plugin developed by the team at DoiT International. DoiT is a reseller of Google Cloud and AWS that helps companies either move from on-premises infrastructure to the cloud or move from one cloud provider to another.

Kubernetes Security Essentials

Getting started with Kubernetes is really easy. In just a matter of minutes you can set up a new cluster with minikube, kops, Amazon EKS, Google Kubernetes Engine, or Azure Kubernetes Service. What isn’t so easy is knowing what to do after you set up your cluster and run a few apps. One of the most important parts of setting up a Kubernetes cluster is to make sure your cluster is secure. In this blog post, we will go over some of the strategies you can use to help secure your Kubernetes cluster.

Layman's Guide to Markdown on Mattermost

One of the biggest challenges with text-based communication is that oftentimes context, emotions, and intentions can get lost in translation. On the other hand, when we speak to one another, we have the advantage of inflection, tone, and body language to help convey our points effectively. Historically, adding context to text-based communications involved writing HTML code—something that was a bit too complicated for the average internet user.