|
By Prathamesh Sonpatki
ClickHouse swallows high-cardinality telemetry at ingest, then breaks at query time weeks later. Here is what fails, and how we keep it fast in production. Prathamesh works as an evangelist at Last9, runs SRE stories - where SRE and DevOps folks share their stories, and maintains o11y.wiki - a glossary of all terms related to observability.
|
By Prathamesh Sonpatki
ClickHouse LowCardinality cuts storage and speeds up queries on low-cardinality columns, but backfires on trace IDs. How to tell the difference.
|
By Prathamesh Sonpatki
Drop a JAR on the JVM. Get distributed tracing, RxJava context propagation, log-trace correlation, and Vert.x internal metrics. No code changes. No Maven dependency. Java 8–21. Inside the design of last9/vertx-opentelemetry v2.3.4.
|
By Prathamesh Sonpatki
Why ECS containers collapse under service.name = aws_ecs and how to fix it for both EC2 launch type and Fargate, including the resource-vs-log-record pitfall that quietly breaks log filtering.
|
By Faiz Shaikh
What actually works for Kubernetes monitoring at scale — not what looks good in a vendor demo with a five-pod cluster.
|
By Prathamesh Sonpatki
LocalStack lets you run SQS, Lambda, and S3 locally in Docker — but there's a hidden trap: OpenTelemetry's default AWS propagator doesn't work with free LocalStack. Here's how to set up end-to-end local testing with working trace propagation.
|
By Prathamesh Sonpatki
SQS doesn't propagate trace context automatically. You instrument both sides, deploy, and get two disconnected traces. This post shows how to wire them into one waterfall — and the ESM format gotcha that silently breaks it every time.
|
By Prathamesh Sonpatki
OpenTelemetry's GenAI instrumentation gives you spans and token counts. It does not give you conversations, workflow cost rollups, or prompts visible in your dashboard. last9-genai is an OTel extension that fills those three gaps — without replacing your existing observability stack.
|
By Prathamesh Sonpatki
Health check endpoints generate thousands of identical, useless spans per day. Here are two production-ready approaches to filter them from your Python OTel traces — and the correctness trap most implementations miss.
|
By Prathamesh Sonpatki
Argo Rollouts exposes Prometheus metrics on port 8090 — but the docs lie about which labels exist. Here's how to scrape them into Last9, build a canary dashboard, and use Last9 as an automated AnalysisTemplate gate, including the auth and base64 gotchas.
Platform engineering provides powerful tools that handle a lot under the hood. Learn how to calculate your remaining error budget with a simple formula using real numbers and objective statements.
30 minutes of eating crow! Learn from our SLO mistakes at Weave. Discover pitfalls and shortcuts to doing it right the first time. Avoid our wrong, wrong, wrong, wrongs!
OpenTelemetry aims to link metrics to traces and logs, offering OpenCensus users a seamless migration path. Work with existing protocols like Prometheus. Leverage existing tooling without learning something completely new.
OpenTelemetry explained: standards, SDKs for various languages (Ruby, Python, Go), and middleware tools. Deploy these to pre-process data and send it to your destination.
Stop debugging infrastructure issues across multiple dashboards. See how Last9's Discover Infrastructure monitors K8s pods and traditional hosts together—with resource analysis, pod-level debugging, and AI that correlates app problems to infrastructure root causes. One setup (K8s + host monitoring) → Complete infrastructure visibility that connects to your services and jobs. No more blind spots between application performance and underlying resources.
Stop debugging background jobs with docker logs and prayer. See how Last9's Discover Jobs monitors async operations like APIs—with P95 latencies, error breakdowns, and operation-level traces for every job type.
Stop playing detective during incidents. See how Last9's Discover Services automatically builds your service map from traces, shows real-time dependencies, and lets you debug with both conversational AI and visual dashboards.
Use Last9 MCP in Claude Desktop or Cursor to analyze logs for a service, get recommendations on improving logging, and optimize log volumes. In this demo, the AI agent uses the `add_drop_rules` MCP tool from Last9 to filter out unnecessary logs and reduce volumes by ~60%.
Demo of using the `get_exceptions` Last9 MCP tool.
Last9 provides tools to improve Reliability in large-scale cloud-native environments.
Our open-standards-based tools provide visibility into the Rube Goldberg of micro-services. We take away the toil of managing a time series database by dramatically reducing your costs and improving developer productivity.
Levitate is our time series metrics & events warehouse designed for scale and high cardinality. Our warehousing capabilities provide necessary control levers to ensure cost-efficient data growth management, surpassing traditional storage solutions.
Start your observability journey today with Levitate. A Managed Time Series Data Warehouse that SREs trust.