
July 2023

How we slashed detection and resolution time in half (Salt Security)

Salt Security had deployed OpenTelemetry but found it insufficient, so the company's engineers evaluated Helios, which visualizes distributed traces for fast troubleshooting. My role as the Director of Platform Engineering at Salt Security lets me pursue my passion for cloud-native tech and for solving difficult system-design challenges. One of the challenges we solved recently had to do with visibility into our services, or the lack thereof.

Debugging and troubleshooting microservices in production: All you need to know

What do you do when things break in production? Debugging microservices isn’t a walk in the park. Microservices are designed to be loosely coupled, which makes them more scalable and resilient, but also more difficult to debug. When a problem occurs in a microservices app, it can be difficult to track down its source. When that problem is in production, the clock is ticking and you have to resolve it fast.

Lambda monitoring: Combining the three pillars of observability to reduce MTTR

Observability and monitoring can be challenging for distributed applications, and serverless architectures are a typical example. As with any other service that we run, we need to understand how our Lambda functions are executed, how to identify issues, and how to optimize performance.