
How to reach 99.99% uptime: High Availability in Practice.

With most businesses struggling to sustain even 99.9% uptime throughout the year, a goal of 99.99% uptime looks daunting to developers. It’s like asking someone to build a bridge that never collapses or a machine that never breaks down, no matter what. In short, it is a hard goal, but it is achievable. Here’s how to reach 99.99% uptime for your business.
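To make these percentages concrete, here is a minimal sketch that converts an availability target into a yearly downtime budget. The function name `downtime_budget_minutes` and the 365-day year are assumptions made for illustration; they do not come from the article itself.

```python
# Yearly downtime budget for a given availability target.
# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year allowed by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% uptime allows about {budget:.1f} minutes of downtime per year")
```

Under these assumptions, 99.9% leaves roughly 525.6 minutes (a little under 9 hours) per year, 99.99% leaves about 52.6 minutes, and 99.999% leaves about 5.3 minutes. Each extra nine shrinks the budget tenfold.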

Hiteshwar shares his thoughts on being an SRE

Hiteshwar is an SRE based out of Mumbai, India, who specializes in distributed systems. He works on Kubernetes, running his own custom clusters, maintaining them and creating tools to manage and monitor them. He likes to share what he learns by writing articles and blogs on Medium and LinkedIn. He is an active speaker at meetups and developer groups and also teaches DevOps and SRE practices at learning centers.

Checklist for publishing a guest post to Fyipe.

Here’s a quick checklist for publishing articles or guest posts on the Fyipe Blog. We invite anyone to publish stories in any of our publications. If you wish to contribute, please send an email to [email protected] with your draft article, and make sure the draft follows the guidelines in this post. Here’s what this means for you as a writer: educate your readers and teach them something new. Cut all the fluff. Get to the point fast. Do not waste their time.

How To Auto Generate SSL Certificates On The Fly

Customers can generate hosted status pages that display the status of their services (websites, APIs, infrastructure), such as availability, response time and incidents, to their own clients or for internal use. All status pages are hosted and maintained by us. Users point a custom domain to our DNS, and within seconds their page is ready. Hassle-free.

Embracing Chaos With BigPanda's Root Cause Analysis Features

The ever-growing complexity, scale and pace of IT environments put a huge burden on IT Ops, NOC, and DevOps teams, who are tasked with keeping these environments up and running. One of the biggest challenges is Root Cause Analysis (RCA). When something breaks, these teams need to determine what broke it, and they need to do it fast.

How to create an on-call schedule that doesn't suck.

A lot of tech companies struggle to create an effective and efficient on-call schedule for their products and services, which results in much longer downtime when something goes wrong. They often overburden team members with repeated on-call duty, which leads to fatigue. Here’s how to create an on-call schedule that your team might love.

LaaS (Language as a Service) With Duolingo

欢迎! [Huānyíng] In Mandarin, this means “welcome,” the first Chinese phrase I ever learned as a Mandarin Language Minor in college. It took me two weeks to understand the tonal variations, one week to memorize and properly execute the written stroke pattern, and another week to hone the ability to say it with confidence to my teacher (aka 老师 [Lǎoshī]).