Performance testing plays a critical role in application reliability. It enables developers and engineering teams to catch issues before they reach production or impact the end-user experience. Understanding performance test results and acting on them, however, has always been a challenge. The difficulty stems from the visibility gap between the black-box data that performance tests produce and the white-box data internal to the system being tested.
Enterprise IT is just a different animal. Whether it’s operating at scale, undertaking massive migrations, working across scores of teams, or addressing tight security requirements, engineers at these organizations can face different obstacles than their counterparts at smaller organizations and startups.
Alexander is a Senior SRE at Prezi, a video and visual communications software company. As a team, the Prezi SREs provide multiple services within the company. One of those is the observability stack, where Prezi relies heavily on Grafana.
Companies are always evolving to run more smoothly, serve their customers better, and operate in a way that is cost-effective.
When faced with an incident, there are two areas that demand your immediate attention: the incident investigation, and the cross-functional coordination needed to resolve the issue. Grafana Incident helps with the collaboration by providing a central hub for communication across teams that seamlessly integrates with the tools you are already using, such as Slack or Microsoft Teams. But how can you best use your telemetry data to debug your application and bring your systems back online?
Do you want to try Grafana for application observability but don’t have time to adapt your application for it? Often, to properly instrument an app, you have to add a language agent to the deployment or package. And, in languages like Go, proper instrumentation means manually adding tracepoints. Either way, you have to redeploy to your staging or production environment once you’ve added the instrumentation.