At Lumigo, we depend heavily on a set of tests to deploy code changes fast. For every pull request opened, we bootstrap our whole application backend and run a set of asynchronous, parallel checks that mimic users’ use cases. We call them integration tests. Recently, we replaced the old “traditional log traversing” of our integration tests with *amazing* OpenTelemetry trace graphs.
As developers, we understand the critical role teamwork and collaboration play in solving complex problems. Often, it’s that second set of eyes that uncovers an additional issue or sheds light on the root cause of a stubborn bug. Effective collaboration can determine whether a team succeeds or fails, especially when debugging or troubleshooting issues within complex applications.
This post is part of an ongoing series about troubleshooting common issues with microservice-based applications. Read the previous one on intermittent failure. Queues are an essential component of many applications, enabling asynchronous processing of tasks and messages. However, queues can become a bottleneck if they don’t drain fast enough, causing delays, increasing costs, and reducing the overall reliability of the system.
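To make the "doesn't drain fast enough" condition concrete, here is a minimal sketch of how a backlog behaves when messages arrive faster than consumers process them. The function name `backlog_over_time` and the constant per-second rates are assumptions made for illustration; real queues see bursty traffic and variable processing times, so treat this as a simplified model rather than a description of any particular queue service.

```python
def backlog_over_time(arrival_rate, drain_rate, seconds, initial_backlog=0):
    """Return the queue depth at the end of each second.

    Hypothetical model: both rates are constant and measured in
    messages per second. The depth can never drop below zero.
    """
    depth = initial_backlog
    history = []
    for _ in range(seconds):
        # Messages added minus messages consumed during this second.
        depth = max(0, depth + arrival_rate - drain_rate)
        history.append(depth)
    return history


# Producers outpace consumers: the backlog grows by 20 messages every second.
print(backlog_over_time(arrival_rate=120, drain_rate=100, seconds=5))
# → [20, 40, 60, 80, 100]

# Consumers outpace producers: an existing backlog of 50 is cleared.
print(backlog_over_time(arrival_rate=80, drain_rate=100, seconds=5, initial_backlog=50))
# → [30, 10, 0, 0, 0]
```

In this model the backlog only shrinks when the drain rate exceeds the arrival rate. If the two stay equal, any backlog that has already built up never clears, which is one way delays can persist even after a traffic spike ends.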