Latest News

Monitoring Kafka performance metrics

Kafka is a distributed, partitioned, replicated log service developed by LinkedIn and open sourced in 2011. In essence, it is a massively scalable pub/sub message queue architected as a distributed transaction log. It was created to provide “a unified platform for handling all the real-time data feeds a large company might have”. Kafka is used by many organizations, including LinkedIn, Pinterest, Twitter, and Datadog. The latest release is version 2.4.1.

Collecting Kafka performance metrics

If you’ve already read our guide to key Kafka performance metrics, you’ve seen that Kafka provides a vast array of metrics on performance and resource utilization, which are available in a number of different ways. You’ve also seen that no Kafka performance monitoring solution is complete without also monitoring ZooKeeper. This post covers some different options for collecting Kafka and ZooKeeper metrics, depending on your needs.

Monitoring Kafka with Datadog

Kafka deployments often rely on additional software packages not included in the Kafka codebase itself—in particular, Apache ZooKeeper. A comprehensive monitoring implementation includes all the layers of your deployment so you have visibility into your Kafka cluster and your ZooKeeper ensemble, as well as your producer and consumer applications and the hosts that run them all.

Monitor Jenkins jobs with Datadog

Jenkins is an open source, Java-based continuous integration server that helps organizations build, test, and deploy projects automatically. Jenkins is widely used, having been adopted by organizations like GitHub, Etsy, LinkedIn, and Datadog. You can set up Jenkins to test and deploy your software projects every time you commit changes, to trigger new builds upon successful completion of other builds, and to run jobs on a regular schedule.

Release 1.21: Introducing new collectors, faster exporters, and improved security

We’re in the middle of a scary, uncertain time, and we hope those of you reading are staying safe and healthy. Despite the current challenges, the 40+ members of the remote-first Netdata team have been hard at work on the next version of the Netdata Agent: v1.21.0. This release is foundational: While we do have fantastic new collectors and three new ways to export your metrics for long-term storage, many of the most significant changes aren’t even those you’ll notice.

Rancher 2.4 Keeps Your Innovation Engine Running with Zero Downtime Upgrades

Delivering rapid innovation to your users is critical in the fast-moving world of technology. Kubernetes is an amazing engine to drive that innovation in the cloud, on-premises and at the edge. All that said, Kubernetes and its entire ecosystem change quickly. Keeping Kubernetes up to date for security and new functionality is critical to any deployment.

Continuously Optimize Your AWS Resources with CloudFormation

If you have discovered that your application demand changes over time, you’re probably wondering how you can continuously adjust your cloud capacity in line with that demand. If you use CloudFormation, then you’re in luck! This article walks through how you can update your template code to implement infrastructure adjustments automatically on a periodic basis.

Block Security Vulnerabilities from Entering Your Code

As continuous software deployments grow and become the accepted standard, security measures gain even more importance. From development and all the way through to production, security requirements should be adopted by all teams in an organization. JFrog IDE integrations provide security and compliance intelligence to the developer right from within their IDE.

Reliable, Self-Healing Kubernetes Explained

One of the great benefits of Kubernetes is its self-healing ability. If a containerized app or an application component goes down, Kubernetes will instantly redeploy it, matching the so-called desired state. But what if a Kubernetes component or a node goes down? Kubernetes doesn’t monitor itself, nor does it have access to your infrastructure. And, guess what.