What is OpenTelemetry Collector
What is the OpenTelemetry Collector? Architecture, deployment, and getting started.
The latest News and Information on Service Reliability Engineering and related technologies.
How JCB improves team structure, risk management, and application and platform development.
InfluxDB vs Thanos: Overview, Pros and Cons, and Differences.
Site reliability engineers manage a lot, and often in incredibly high-stakes environments. Remember that scene from "The Matrix" where Neo dodges bullets in slow motion? Of course you do. As an SRE, it can feel like you're the person getting hit by those bullets, frantically trying to investigate performance issues, automate away toil, and support the engineers around you, all before the next wave of attacks.
As new incidents emerge, there are often many unknowns about the size, severity, and cause of the problem. Sometimes it’s not even clear whether the problem is an incident at all. That’s where introducing a triage stage to your incident management process can help. In this post, we’ll look at the benefits of adding a triage layer to your incident management, and how Rootly’s Triage feature allows you to seamlessly transition from triage to a real incident (or a false alarm).
If all companies are software companies, all companies need better observability to understand how well their software performs.
Comparing Prometheus vs. VictoriaMetrics (VM) - Scalability, Performance, Integrations.
Comparing Prometheus vs. Cortex - Scalability, Cost, Performance, Known Weaknesses.
Take back control of your Monitoring with Levitate - a managed time series data warehouse.