Latest News

Website Management Tips for Business Owners

Whatever the nature of your business, whether you are a freelancer or an online retailer, the quality of your website is key to the success of your organisation. A well-run website serves as the online storefront of your organisation, acting as the destination point for people who are searching for the services you provide or the products you sell. Are you confident that you have all the main bases covered when it comes to the management of your website?

Monitor Amazon Managed Streaming for Apache Kafka with Datadog

Amazon Managed Streaming for Apache Kafka (MSK) is a fully managed service that allows developers to build highly available and scalable applications on Kafka. In addition to enabling developers to migrate their existing Kafka applications to AWS, Amazon MSK handles the provisioning and maintenance of Kafka and ZooKeeper nodes and automatically replicates data across multiple availability zones for high availability.

Docker Container Performance Metrics to Monitor

In Part 1 we described what container monitoring is and why you need it. Each container typically runs a single process, has its own environment, uses virtual networks, and may manage storage in a variety of ways. Traditional monitoring solutions, by contrast, take metrics from each server and the applications it runs. Those servers and applications are typically very static, with very long uptimes.

Docker Containers Management: Main Challenges & How to Overcome Them

Even though containers have been around for ages, it wasn’t until Docker showed up that containers really became widely adopted. Docker has made it easier, faster, and cheaper to deploy containerized applications. However, organizations that adopt container orchestration tools for application deployment face new maintenance challenges.

How Cortex Is Evolving to Ingest 1 Trillion Samples a Day

As the open-source monitoring system Prometheus grew, so did the need to scale its capacity in a way that is multi-tenant and horizontally scalable, along with the ability to handle virtually unlimited amounts of long-term storage. So in 2016, Julius Volz and Tom Wilkie (who is now at Grafana Labs) started Project Frankenstein, which was eventually renamed Cortex.

Getting At The Good Stuff: How To Sample Traces in Honeycomb

(This is the first post by our new head of Customer Success, Irving.) Sampling is a must for applications at scale; it’s a technique for reducing the burden on your infrastructure and telemetry systems by only keeping data on a statistical sample of requests rather than 100% of requests. Large systems may produce large volumes of similar requests which can be de-duplicated.
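To make the idea of keeping only a statistical sample of requests concrete, here is a minimal sketch of deterministic trace sampling. It is not Honeycomb's API: the function names keep_trace and sample_events, the event field names trace_id and sample_rate, and the choice of SHA-256 hashing are all assumptions made for illustration. Hashing the trace ID means every event in a trace gets the same keep-or-drop decision, and recording the sample rate on each kept event lets a backend scale counts back up.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: int) -> bool:
    """Return True for roughly 1 in `sample_rate` trace IDs, deterministically."""
    if sample_rate < 1:
        raise ValueError("sample_rate must be >= 1")
    # Hash the trace ID so the decision depends only on the ID itself,
    # which keeps all events of one trace together.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket % sample_rate == 0


def sample_events(events, sample_rate):
    """Keep events from sampled traces and tag each with its sample rate."""
    kept = []
    for event in events:
        if keep_trace(event["trace_id"], sample_rate):
            # Each kept event stands in for `sample_rate` original events.
            kept.append({**event, "sample_rate": sample_rate})
    return kept
```

With a sample rate of 10, about one trace in ten is retained, and each kept event can be counted as ten when estimating totals.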

Getting Started with InfluxDB and Pandas

InfluxData prides itself on prioritizing developer happiness. A large part of maintaining developer happiness is providing client libraries that allow users to interact with the database through the language and library of their choosing. Data analysis is the task most broadly associated with Python use cases, accounting for 58% of Python tasks, so it makes sense that Pandas is the second most popular library for Python users.

Tracking Systems Metrics with collectd

System administrators hold many key responsibilities within an IT organization. Most importantly, they must ensure that all systems, services, and applications are up, running, and performing as expected. When a system starts to lag or an application is down, the system administrators are called upon to troubleshoot and resolve the issue as quickly as possible to limit the impact on customers.