Prometheus, the de facto standard for Kubernetes monitoring, works well for many basic deployments, but managing Prometheus infrastructure can become challenging at scale. As Kubernetes deployments continue to play a bigger role in enterprise IT, scaling Prometheus for a large number of metrics across a global footprint has become a pressing need for many organizations.
The need for relevant and contextual telemetry data to support online services has grown in the last decade as businesses undergo digital transformation. These data are typically the difference between proactively remediating application performance issues or costly service downtime. Distributed tracing is a key capability for improving application performance and reliability, as noted in SRE best practices.