
May 2023

Crash Course on Building and Monitoring AWS CDK Apps

In this webinar, learn how to use the AWS Cloud Development Kit (CDK) to build a complex microservice-based application and implement distributed tracing to monitor it. You'll be able to follow along with Thorsten Höeger, Cloud Automation Evangelist, and AWS CDK expert Michele Mancioppi, as they live-code an application that uses AWS Lambda with Node.js, and Amazon ECS with Java. Once built, you'll learn how you can apply distributed tracing to any AWS CDK-based application, in just a single line of code.

Leveraging OpenTelemetry to Fix Flaky Integration Tests

At Lumigo, we depend heavily on a set of tests to deploy code changes fast. For every pull request opened, we bootstrap our whole application backend and run a set of asynchronous, parallel checks that mimic users’ use cases. We call them integration tests. These integration tests are how we ensure: Recently, we replaced our old “traditional log traversing” of integration tests with *amazing* OpenTelemetry trace graphs.

Kubernetes Design Patterns For Optimal Observability

Technology is a fast-moving commodity. Trends, thoughts, techniques, and tools evolve rapidly in the software technology space. This rapid change is felt particularly in the software that cloud-native engineers use to build, deploy, and operate their applications. One area that has evolved especially quickly over the past few years, and even months, is observability.

Faster Debugging with Collaborative Troubleshooting Tools

As developers, we understand the critical role teamwork and collaboration play in solving complex problems. Often, it’s that second set of eyes that uncovers an additional issue or sheds light on the root cause of a stubborn bug. Effective collaboration becomes a critical factor in determining a team’s success or failure, especially when debugging or troubleshooting issues within complex applications.

Troubleshooting Slow Draining SQS Queues

This post is part of an ongoing series about troubleshooting common issues with microservice-based applications. Read the previous one on intermittent failure. Queues are an essential component of many applications, enabling asynchronous processing of tasks and messages. However, queues can become a bottleneck if they don’t drain fast enough, causing delays, increasing costs, and reducing the overall reliability of the system.