The latest News and Information on Distributed Tracing and related technologies.
Running and troubleshooting production services requires deep visibility into your applications and infrastructure. While basic logs and metrics are available out of the box with Google Cloud Compute Engine (GCE), capturing advanced data used to require the installation of both a metrics agent and a logging agent.
Slack experienced meteoric growth between 2017 and 2020—but that level of growth came with growing pains. In his talk at the 2021 o11ycon+hnycon, Frank Chen (LinkedIn), a Slack Senior Staff Engineer, detailed one of Slack’s biggest pain points in that period: flaky tests. A flaky test returns both a passing and failing result despite no changes in the code. At one point, between 2017 and 2020, Slack’s flaky test rate reached as high as 50%.