Analytics

Build a Data Streaming Pipeline with Kafka and InfluxDB

InfluxDB and Kafka aren’t competitors – they’re complementary. Streaming data, and more specifically time series data, travels in high volumes and at high velocities. Adding InfluxDB to your Kafka cluster provides specialized handling for your time series data. This specialized handling includes real-time queries and analytics, and integration with cutting-edge machine learning and artificial intelligence technologies. Companies such as Hulu have paired their InfluxDB instances with Kafka.

Using Cribl Stream to Correct Misconfigured Data in Datadog

The challenge for every organization is gathering actionable observability information from all of its systems, in a timely manner, without creating a substantial operational burden for the teams managing the collection tooling. While each observability solution has its unique benefits and challenges, the one common burden expressed by teams is managing the metadata of metrics, traces, and logs.

Data Visualization for Everyone: How To Simplify the Process

Nowadays, data is being generated at an unprecedented pace. Data is collected everywhere, from various social media platforms to e-commerce websites. This explosion of data has made it almost impossible to make sense of it through traditional methods. This is where data visualization comes into the picture. Data visualization enables companies to interpret vast amounts of information and draw conclusions quickly. It allows users to analyze data in a more accessible and straightforward way.

Pick 3 for Your Data Management: Speed, Choice, and Flexibility

Data growth is significantly outpacing budgets, so the products we use have to do more. This is where optimization comes into play. Optimization is generally associated with reduction, which may be intimidating: what if something important is removed? How can you identify what should be reduced? Reduction isn’t about removing context, but about removing repetitive data, dropping meaningless fields, or even flattening JSON.
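
As a rough illustration of the kind of reduction described above, the following Python sketch flattens a nested JSON event and drops empty or unwanted fields. The function names, the dotted key separator, and the sample field names are assumptions made for this example; they are not taken from any particular product.

```python
from typing import Any


def flatten(record: dict[str, Any], parent: str = "", sep: str = ".") -> dict[str, Any]:
    """Collapse nested dictionaries into a single level with dotted keys."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def is_empty(value: Any) -> bool:
    """Treat None, empty strings, and empty lists as carrying no information."""
    return value is None or value == "" or value == []


def reduce_event(event: dict[str, Any], drop_fields: frozenset[str] = frozenset()) -> dict[str, Any]:
    """Flatten an event, then remove empty values and explicitly unwanted fields."""
    flat = flatten(event)
    return {k: v for k, v in flat.items() if k not in drop_fields and not is_empty(v)}


# Hypothetical event used only for demonstration.
event = {
    "host": "web-1",
    "http": {"status": 200, "headers": {"agent": "curl", "trace": ""}},
    "debug": None,
    "pad": "xxxx",
}
reduced = reduce_event(event, drop_fields=frozenset({"pad"}))
```

The context that matters (host, status, user agent) survives, while the empty and padding fields do not. Which fields count as meaningless is a decision for whoever owns the data, so the drop list is passed in rather than hard-coded.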

Optimization Without Recommendations: Automating Your Cost Optimization on Amazon EKS

Learn how Pepperdata uses machine learning to provide Continuous Intelligent Tuning automatically to your Amazon EKS applications, helping your platform team recover wasted capacity and ultimately reduce your spend for cloud resources.

Navigating Data Overload with Cribl

So many businesses today are playing “Hungry, Hungry, (Data) Hippo,” devouring every marble of information they can get their hands on. While it seems like every company has a robust data aggregation system, what most companies don’t have is an efficient way to control what data they store and where that data goes. We all want to make data-driven business decisions, but sorting through tons of data to find useful business insights can be like finding a needle in a whole farm.

Image recognition with Python, OpenCV, OpenAI CLIP and pgvector

In this video you’ll learn how to build an offline face recognition pipeline to find faces in complex pictures. The full written explanation is available in the dedicated article. The pipeline uses Python and OpenCV to detect faces within complex pictures, Python and an OpenAI CLIP model to calculate the face embeddings, and PostgreSQL with the pgvector extension to store the embeddings and calculate the distances between them.
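
To make the last step of that pipeline more concrete, here is a minimal pure-Python sketch of a nearest-neighbor lookup over embeddings. It assumes cosine distance as the metric (the teaser does not say which one is used) and uses tiny made-up vectors in place of real CLIP embeddings. In the pipeline itself this comparison would run inside PostgreSQL, where pgvector exposes cosine distance as the `<=>` operator.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: list[float], stored: dict[str, list[float]]) -> str:
    """Return the label of the stored embedding closest to the query."""
    return min(stored, key=lambda label: cosine_distance(query, stored[label]))


# Hypothetical 3-dimensional "embeddings"; real CLIP vectors are much larger.
stored_faces = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
}
match = nearest([0.9, 0.1, 0.0], stored_faces)
```

A linear scan like this is fine for a handful of faces. Storing the vectors in a database and letting it compute the distances becomes the more practical choice as the collection grows.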

Mage.ai for Tasks with InfluxDB

Any existing InfluxDB user will notice that InfluxDB underwent a transformation with the release of InfluxDB 3.0. InfluxDB v3 provides 45x better write throughput and has 5-25x faster queries compared to previous versions of InfluxDB (see this post for more performance benchmarks). We also deprioritized several features that existed in 2.x to focus on interoperability with existing tools. One of the deprioritized features that existed in InfluxDB v2 is the task engine.

The Plan for InfluxDB 3.0 Open Source

The commercial version of InfluxDB 3.0 is a distributed, scalable time series database built for real-time analytic workloads. It supports infinite cardinality, SQL and InfluxQL as native query languages, and manages data efficiently in object storage as Apache Parquet files. It delivers significant gains in ingest efficiency, scalability, data compression, storage costs, and query performance on higher cardinality data.