Zenduty

MTBF, MTTR, MTTF, MTTA: Incident Metrics Explained

Jun 26, 2024 By Anjali Udasi In Zenduty

When it comes to managing incidents and ensuring operational efficiency, understanding key metrics is crucial. Among the most important are MTBF (Mean Time Between Failures), MTTR (Mean Time To Repair), MTTF (Mean Time To Failure), and MTTA (Mean Time To Acknowledge). In this blog, we'll explore these metrics along with some best practices and practical applications.

The Science of Building Cloud Native DevTools - Incidentally Reliable with Ramiro Berrelleza

Jun 21, 2024 By Zenduty In Zenduty

Catch Ramiro Berrelleza, Founder and CEO at Okteto, talk about how impactful DevTool startups are built, the importance of investing in Developer Experience, and the emerging issues with the Cloud Native ecosystem.

Four Golden Signals: Key Indicators for System Reliability

Jun 3, 2024 By Anjali Udasi In Zenduty

System reliability is crucial for providing seamless user experiences and enabling effective business operations. The "4 Golden Signals" (latency, traffic, errors, and saturation) offer a comprehensive view of system performance and potential issues. In this blog, we take a deep dive into system reliability and explore these four key metrics for monitoring system health and ensuring optimal performance.

Credit-Worthy Reliability - Incidentally Reliable with Krishnendu Majumdar

May 30, 2024 By Zenduty In Zenduty

Catch Krishnendu Majumdar (CPTO at Yubi) talk about his journey in the dynamic Indian startup ecosystem, strategies to build for scale from Day 1, and insights into building sustained user trust via exceptional product performance in high-governance industries like credit and finance.

The Reliability Stories You Won't Hear on LinkedIn

May 24, 2024 By Anjali Udasi In Zenduty

We had the pleasure of meeting Ponmani Palanisamy, a Staff Site Reliability Engineer at LinkedIn, at a recent SRE Meetup in Bangalore. Ponmani gave an insightful talk on "Improving data redundancy and rebalancing data in HDFS." We were captivated by his talk and eager to learn more about his experience in the reliability space. We talked about everything including his journey, experiences, and of course, his most memorable war room stories over a steady career of 17 years. Here's what he had to share.

KPI vs. SLA: Important Metrics in Incident Management

May 20, 2024 By Anjali Udasi In Zenduty

Organizations prioritize Key Performance Indicators (KPIs) and Service Level Agreements (SLAs) to achieve optimal performance. However, understanding the differences between KPIs and SLAs can be challenging. In this blog, we discuss what KPIs and SLAs are and the key differences between them.

Reliability for the Books - Incidentally Reliable with Niall Murphy

May 10, 2024 By Zenduty In Zenduty

Catch Niall Murphy (Co-Founder of Stanza Systems) talk about graceful degradation, what startups are getting wrong about reliability, and how well-thought-out user experiences can communicate credibility to current and potential customers. Exclusively on The Incidentally Reliable podcast, made by SREs for SREs and hosted by Zenduty.

What are some startups Solomon Hykes is rooting for?

May 7, 2024 By Zenduty In Zenduty

What are some startups Solomon Hykes is rooting for? What's his most controversial opinion? Who are some community members that more people should follow? Discover the answers to these questions, and a lot more in the Incidentally Reliable Podcast with Solomon Hykes, live on all major platforms! Tune in as Solomon shares stories from the early days of Docker, Inc, the rollercoaster journey leading to 20 million active developers worldwide, the heavy crown of a tech leader and his vision to revolutionize CI/CD with Dagger today.

Reinventing Deployments: From Docker to Dagger -- Incidentally Reliable with Solomon Hykes

Apr 30, 2024 By Zenduty In Zenduty

Catch Solomon Hykes (Co-founder of @Docker and @Dagger) share stories from the early days of Docker, the rollercoaster journey leading to 20 million active developers worldwide, the heavy crown of a tech leader, and his vision to revolutionize CI/CD with Dagger today. Exclusively on The Incidentally Reliable podcast, made by SREs for SREs and hosted by Zenduty.

Insights of an Observability Advocate: The Challenges and Rewards

Apr 28, 2024 By Anjali Udasi In Zenduty

At a recent SRE Meetup in Bangalore, we had the pleasure of meeting Akshay Deshpande. During our conversation, Akshay, who manages a Performance/Observability Engineering team at Smarsh, discussed his passion for observability and his constant drive to improve the field. Smarsh helps companies gain valuable insights from their communication data, enabling them to proactively identify potential regulatory and reputational risks before they escalate.
