The latest News and Information on Service Reliability Engineering and related technologies.

Amazon SQS Metrics: Monitor, Debug, and Optimize Your Message Queues

Jun 25, 2025 By Anjali Udasi In Last9

Message queues quietly take care of a lot—buffering workloads, smoothing traffic spikes, and keeping services connected. But they don’t always get much attention until something feels off. Amazon SQS offers a solid set of metrics to help you understand how your queues are doing, whether you’re scaling well or nearing limits. This blog breaks down the key SQS metrics: where to find them, what they mean, and how to respond when things start to shift.

How to Configure Docker's Shared Memory Size (/dev/shm)

Jun 25, 2025 By Faiz Shaikh In Last9

Your Node.js app runs fine on your machine. But inside Docker? You start getting weird crashes—ENOSPC: no space left on device. Chrome headless tests fail out of nowhere. PostgreSQL throws shared memory errors under load. The problem? It’s probably /dev/shm, the shared memory volume Docker sets up by default. Most containers get just 64MB of space here.
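
To see whether /dev/shm is the bottleneck, you can check its capacity from inside the container. The sketch below is a minimal example, not taken from the post: the function name `shm_report` and the 64MB threshold are illustrative assumptions, and it relies only on Python's standard `shutil.disk_usage`.

```python
import shutil

# 64MB: the default /dev/shm size mentioned above (used here as an assumed threshold).
DOCKER_DEFAULT_SHM_BYTES = 64 * 1024 * 1024


def shm_report(path="/dev/shm", required_bytes=DOCKER_DEFAULT_SHM_BYTES):
    """Return total and free bytes for `path`, plus whether `required_bytes` are free.

    `shm_report` is a hypothetical helper name used for illustration only.
    """
    usage = shutil.disk_usage(path)  # named tuple: (total, used, free)
    return {
        "path": path,
        "total_bytes": usage.total,
        "free_bytes": usage.free,
        "enough_free": usage.free >= required_bytes,
    }
```

Calling `shm_report()` inside a container shows how much shared memory is actually available. If the total comes back near 64MB and your workload needs more, starting the container with a larger `--shm-size` value on `docker run` is the usual fix.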

11 Best Log Monitoring Tools for Developers in 2025

Jun 24, 2025 By Anjali Udasi In Last9

Your checkout API just started throwing 500s during peak traffic. You SSH into production, tail logs across six microservices, and realize that a database timeout buried in one service's logs is causing cascading failures. Two hours later, you've fixed it, but you're thinking: "There has to be a better way." There is. Log monitoring tools centralize logs from your entire stack, making debugging systematic instead of archaeological.

Prometheus Logging Explained for Developers

Jun 20, 2025 By Prathamesh Sonpatki In Last9

Running apps in production? You need visibility fast. Traditional logging gives you scattered events. Prometheus gives you structured, queryable data that scales. In this guide, we’ll break down how to use Prometheus for logging-style observability, where it fits in your stack, and how to plug it into tools like Grafana or your cloud-native setup.

Docker Stop vs Kill: When to Use Each Command

Jun 19, 2025 By Anjali Udasi In Last9

When a container starts consuming excessive memory or becomes unresponsive, you need a way to shut it down. The two primary options, docker stop and docker kill, both terminate containers, but they work differently and have different implications. The key difference: docker stop sends SIGTERM for a graceful shutdown, then escalates to SIGKILL if the process doesn’t exit in time. docker kill skips straight to SIGKILL, terminating the container immediately.
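
A graceful shutdown only happens if the process inside the container handles SIGTERM. The sketch below shows one way an app might do that in Python. The names `GracefulShutdown` and `run` are hypothetical and not from the post; only the standard `signal` module is used.

```python
import signal


class GracefulShutdown:
    """Records that SIGTERM arrived so the main loop can exit cleanly.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self.stop_requested = False

    def install(self):
        # docker stop delivers SIGTERM first, which a process can catch.
        # SIGKILL (docker kill's default) cannot be caught, so there is
        # nothing to register for it.
        signal.signal(signal.SIGTERM, self.handle)

    def handle(self, signum, frame):
        # Only set a flag here; the main loop decides when to stop.
        self.stop_requested = True


def run(jobs, shutdown):
    """Process jobs in order, stopping before the next job once shutdown is requested."""
    done = []
    for job in jobs:
        if shutdown.stop_requested:
            break
        done.append(job)  # stand-in for real work
    return done
```

In a real service you would call `install()` once at startup from the main thread, then check `stop_requested` between units of work. With docker kill there is no such window: the process is terminated without the handler ever running.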

Access Logs: Format Specification and Practical Usage

Jun 18, 2025 By Anjali Udasi In Last9

Your server's been logging everything—it’s just easy to overlook until something breaks. Every incoming request, database call, or auth check ends up in your access logs. They’re not flashy, but they quietly document every interaction your system handles. For developers, they’re often the most reliable starting point when things go wrong. In this blog, we'll take a look at what an access log is, its format, types, and a few best practices.

Log Management and Query Optimization in Kibana

Jun 18, 2025 By Faiz Shaikh In Last9

When troubleshooting with the Elastic Stack, Kibana is often the interface you’ll rely on to query and visualize logs. It doesn’t change the data—it just makes it searchable and a bit easier to work with under pressure. If you’re investigating an outage, tracking performance issues, or trying to correlate events across services, Kibana’s log exploration tools can speed up the process, assuming they’re configured and used well.

Network Latency: Types, Causes, and Fixes

Jun 17, 2025 By Anjali Udasi In Last9

Sometimes your API call takes a few seconds longer than expected. Or users start reporting slow page loads. One of the most common reasons? Network latency.

Azure CDN for Static Assets, APIs, and Front Door

Jun 17, 2025 By Faiz Shaikh In Last9

If your users are spread across the globe but your servers are sitting in Virginia, you’ll probably hear complaints about slow load times, especially from places like Australia. CDNs fix this by caching static assets closer to where your users are. Azure CDN does exactly that, and it fits well if you're already using Azure services. You can hook it up to Blob Storage, App Services, or your own origin. This guide covers how to set it up, what to expect, and how to know it’s working.

Everything You Need to Know About Event Logs

Jun 13, 2025 By Faiz Shaikh In Last9

Your code passes locally, CI is green, and the deploy goes through. Then production throws a 500, and the trace isn’t helpful. This is where event logs help. An event log captures timestamped records of what the app did: HTTP requests, DB queries, cache misses, retries, failures. These entries give you enough context to debug without reproducing the issue locally. Especially when dealing with distributed systems, logs are often the only consistent source of truth.
