SRE

The latest News and Information on Service Reliability Engineering and related technologies.

PromQL Cheat Sheet: Must-Know PromQL Queries

Sep 12, 2024 By Prathamesh Sonpatki, In Last9

This cheat sheet provides practical guidance for diagnosing issues and understanding trends.

Read Post

Last9

Read more about PromQL Cheat Sheet: Must-Know PromQL Queries

PromCon 2024 - Day 1

Sep 12, 2024 By Prathamesh Sonpatki In Last9

Get a quick overview of Day 1 at PromCon 2024, which featured significant announcements on Prometheus 3.0 and OpenTelemetry compatibility.

Read Post

Last9

Read more about PromCon 2024 - Day 1

When Alerts Don't Mean Downtime - Preventing SRE Fatigue

Sep 12, 2024 By Hrishikesh Barua In IncidentHub

A recent question in an SRE forum triggered this train of thought. I've paraphrased the question to reflect its essence. There is plenty to unravel here. My first reaction to this question was that the SRE who posted this is in a difficult place with systemic issues.

Read Post

IncidentHub

Read more about When Alerts Don't Mean Downtime - Preventing SRE Fatigue

The Role of Technology in Enhancing Incident Response Call Etiquette

Sep 11, 2024 By Vishal Padghan In Squadcast

The interconnectedness of today's business environment has significantly heightened the complexity of incident response (IR). The need for immediate action, precise communication, and real-time collaboration is more critical than ever. However, beyond the technical precision required in solving problems, there lies an often overlooked aspect of effective IR management: the etiquette of incident response calls.

Read Post

Squadcast

Read more about The Role of Technology in Enhancing Incident Response Call Etiquette

Tutorial 9 - Incident Responders

Sep 11, 2024 By Zenduty In Zenduty

Zenduty is a revolutionary incident management platform that gives you greater control and automation over the incident management lifecycle.

View Video

Zenduty

Read more about Tutorial 9 - Incident Responders

Tutorial 10 - Incident Roles

Sep 11, 2024 By Zenduty In Zenduty

Zenduty is a revolutionary incident management platform that gives you greater control and automation over the incident management lifecycle.

View Video

Zenduty

Read more about Tutorial 10 - Incident Roles

OpenTelemetry Protocol (OTLP): A Comprehensive Guide to Modern Observability.

Sep 11, 2024 By Gabriel Diaz In Last9

Learn about OTLP’s key features, and how it simplifies telemetry data handling, and get practical tips for implementation.

Read Post

Last9

Read more about OpenTelemetry Protocol (OTLP): A Comprehensive Guide to Modern Observability.

Streaming Aggregation: Real-Time Data Processing in 2024

Sep 11, 2024 By Anjali Udasi In Last9

We break down the essentials of streaming aggregation and its impact on modern data processing.

Read Post

Last9

Read more about Streaming Aggregation: Real-Time Data Processing in 2024

How to deploy a Slack bot to allow anyone in your team to quickly raise major incidents on Zenduty

Sep 9, 2024 By Vishwa Krishnakumar In Zenduty

One of the biggest challenges for some of our customers was allowing non-engineering teams, such as Support, Sales, or Sustomer Success teams, to raise incidents for specific Dev/Infra/Security/Ops teams on Zenduty in a structured and efficient manner as soon as a customer reports an issue. In many organizations, we observed that non-technical team members often needed to switch between platforms, fill out complex forms, or reach out to multiple stakeholders manually to ensure that an issue is escalated.

Read Post