Blog

Simplifying Cloud Management With OpsRamp: Part 1

Analyst firm IDC forecasts that organizations will collectively spend $370 billion on public cloud services and infrastructure in 2022. Given the skyrocketing adoption of public cloud platforms, enterprises are using a number of approaches (lift and shift, replatforming, and refactoring) to smoothly transition on-prem workloads to the cloud.

Monitor Alibaba Cloud with Datadog

Alibaba Cloud provides a comprehensive suite of cloud computing services to power businesses across the globe. We are excited to announce that our new integration with Alibaba Cloud is now in public beta. While the Datadog Agent has always been able to provide visibility into Alibaba Cloud instances, this new integration now enables you to also monitor the health and performance of Alibaba Cloud services (load balancers, managed databases, and more) in Datadog.

4 Best Practices for choosing your DevOps tools

If you google “DevOps tools,” you’ll see a dizzying litany of software applications, all promising to simplify your life as a DevOps engineer. This can be intimidating, not only because so many DevOps solutions are available that it is difficult to know which ones best fit your needs, but also because the idea of having to learn and “carry around” so many tools is itself unnerving.

Prometheus metrics / OpenMetrics code instrumentation

In the following example-driven tutorial, you will learn how to use Prometheus metrics / OpenMetrics to instrument your code, whether you are using Golang, Java, Python, or JavaScript. We will cover the different metric types and provide readily executable code snippets. Prometheus is an open source time series database for monitoring that was originally developed at SoundCloud before being released as an open source project.
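To give a feel for what instrumentation produces, here is a minimal sketch in Python of two metric types (a counter and a gauge) rendered in the Prometheus text exposition format. It deliberately does not use the official client library: the `Metric`, `Counter`, and `Gauge` classes and the metric names below are hypothetical, written only for illustration.

```python
class Metric:
    """Hypothetical base class: one unlabeled sample per metric."""

    kind = "untyped"

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self.value = 0.0

    def render(self):
        # Text exposition format: HELP line, TYPE line, then the sample.
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} {self.kind}\n"
            f"{self.name} {self.value}\n"
        )


class Counter(Metric):
    """A value that only ever goes up, such as requests served."""

    kind = "counter"

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge(Metric):
    """A value that can go up or down, such as a queue depth."""

    kind = "gauge"

    def set(self, value):
        self.value = float(value)


# Illustrative metric names, not taken from the tutorial.
requests_total = Counter("http_requests_total", "Total HTTP requests handled.")
queue_depth = Gauge("job_queue_depth", "Jobs currently waiting in the queue.")

for _ in range(3):
    requests_total.inc()
queue_depth.set(7)

exposition = requests_total.render() + queue_depth.render()
print(exposition)
```

In a real service, a client library would maintain these values and serve the same kind of text over HTTP for Prometheus to scrape.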

NetFlow Basics: An Introduction to Monitoring Network Traffic

To fully understand what NetFlow is and why it’s used for network monitoring, we first need to know what a flow is. When computers need to talk to one another they establish communication channels, commonly referred to as connections. (Technically speaking, these communication channels can only be called connections when the TCP protocol is involved.) A flow refers to any connection or connection-like communication channel.
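The idea of a flow can be sketched in a few lines of Python. The sketch assumes that packets sharing the same source address, destination address, source port, destination port, and protocol belong to one flow. That 5-tuple key is a common convention rather than something stated above, and the `FlowKey` type, the `aggregate_flows` helper, and the sample packets are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Assumed flow identity: the classic 5-tuple."""

    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


def aggregate_flows(packets):
    """Group packets into flows and total their packet and byte counts."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src_ip, dst_ip, src_port, dst_port, protocol, size in packets:
        key = FlowKey(src_ip, dst_ip, src_port, dst_port, protocol)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += size
    return dict(flows)


# Made-up packets: two belong to one TCP connection, one is a UDP lookup.
sample_packets = [
    ("10.0.0.5", "10.0.0.9", 51000, 443, "tcp", 1500),
    ("10.0.0.5", "10.0.0.9", 51000, 443, "tcp", 400),
    ("10.0.0.7", "10.0.0.53", 40000, 53, "udp", 80),
]

flows = aggregate_flows(sample_packets)
for key, stats in flows.items():
    print(key, stats)
```

Note that the UDP exchange counts as a flow even though no connection exists, which is the "connection-like" case described above.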

GrafanaCon L.A. Recap: Grafana 6.0, LGTM, and More!

The rest of the city may still have been in a post-Oscars haze, but over 350 monitoring mavens gathered in downtown L.A. bright and early on Feb. 25 to kick off GrafanaCon 2019. The next two days were filled with 40+ talks, including Grafana end user stories from companies like Bloomberg and Tinder.

How to Monitor Amazon DynamoDB with CloudWatch

Amazon DynamoDB is a key-value and document database that lets you scale to huge numbers of records with single-digit millisecond performance. However, because it is a managed service, traditional monitoring tools give you less visibility into it, which makes the monitoring tools available in AWS even more important. In this post, we’ll explain how to use CloudWatch to monitor DynamoDB and which metrics are important to watch.

Quantifying the Digital Employee Experience

We’ve talked to a lot of people about their company’s digital employee experience the past few years – from C-suite executives and board members looking to make sure they’re doing what they can to make work lives better and retain staff, to the actual CIOs and IT managers tasked with changing and improving their employees’ workplace experience. We’ve even heard from employees on the front lines every day about what works and what doesn’t at their companies.

Detecting the Kubernetes API Server DoS Vulnerability (CVE-2019-1002100)

Recently, a new Kubernetes-related vulnerability affecting the kube-apiserver was announced. It is a denial of service vulnerability in which authorized users with write permissions can overload the API server while it is handling requests. The issue is rated medium severity (CVSS score of 6.5) and can be resolved by upgrading the kube-apiserver to v1.11.8, v1.12.6, or v1.13.4.
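As a rough illustration, the fixed versions listed above can be turned into a small Python check. The `PATCHED` table comes straight from those three versions. The helper names are hypothetical, and the handling of other release lines is an assumption: the sketch treats anything older than 1.11 as unpatched and anything newer than 1.13 as patched.

```python
import re

# Minimum patch release containing the fix, per minor release line.
PATCHED = {(1, 11): 8, (1, 12): 6, (1, 13): 4}


def parse_version(version):
    """Parse strings like 'v1.12.6' or '1.13.4-gke.1' into (1, 12, 6)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version):
    """Return True if this kube-apiserver version includes the fix."""
    major, minor, patch = parse_version(version)
    if (major, minor) in PATCHED:
        return patch >= PATCHED[(major, minor)]
    # Assumption: newer release lines carry the fix, older ones do not.
    return (major, minor) > (1, 13)


for candidate in ["v1.11.7", "v1.12.6", "v1.13.4", "v1.10.12"]:
    status = "patched" if is_patched(candidate) else "upgrade needed"
    print(candidate, status)
```

The version string to check could come from the `gitVersion` field reported by `kubectl version`.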