Latest Posts

What you can show on your status page

Jan 14, 2020 By Squadcast In Squadcast

When something goes down, the first thing a customer does is check whether the problem lies in their own systems or with one of their service providers. So it’s important that your status page carries all the information they need, so that they don’t feel the need to raise an issue or create a ticket, which adds to your support costs.

Using a Status Page in your Incident response process

Jan 10, 2020 By Prakya Vasudevan In Squadcast

A status page is a communication tool that lets you display the current working status of your various services, whether fully functional, partially degraded, severely affected, and so on. You can define the nomenclature of the service status yourself. On the status page, you can also access and update the uptime and incident history data for all your internal-facing or customer-impacting components.

Reducing On-call Alert Fatigue with Deduplication

Jan 8, 2020 By Prakya Vasudevan In Squadcast

Alert noise is a very common on-call complaint, and it leads to fatigue and burnout. This article is an attempt to help folks address the problem.

Squadcast's Year in Review, 2019

Dec 31, 2019 By Prakya Vasudevan In Squadcast

We’re heading into 2020 with a platform full of features and a heart full of happiness! It’s the end of a decade, and this year has been nothing short of great for us! 2019 brought accelerated product growth, and our team doubled in size. We kick-started the year with a complete UI refresh and a whole bunch of new features. We also sponsored some major tech events and held our first-ever community-driven meetup!

How to avoid on-call burnout

Dec 20, 2019 By Prakya Vasudevan In Squadcast

It sucks to be on-call when processes are not well defined and streamlined, especially around the holidays. You really don't want to hear your phone going off repeatedly right when you're sitting down for Christmas dinner with your loved ones or getting to unwrap the good presents (the ones with the sparkly wrapping paper :P). Your on-call team’s stress levels reflect the health of your system, the cleanliness of your code and the culture of your organization.

Transparency in Incident Response

Dec 16, 2019 By Squadcast In Squadcast

When your production systems are hit with a critical issue, you can trust your DevOps team, your sysadmins or your SREs to get the system back on track. This is a no-brainer. In turn, these folks need to be able to trust the rest of the team, be it engineering, customer support or product management, to let them do their jobs. But where does this trust come from? It comes from understanding: the more you understand, the more you can trust.

Danny Mican on his experience as an SRE at Auth0

Dec 2, 2019 By Prakya Vasudevan In Squadcast

Danny is an SRE at Auth0 and currently manages the reliability of systems that authenticate over 2.5 billion logins per month and are expected to have 99.9% (three nines) availability. He loves learning about systems and making changes that positively impact client happiness, employee happiness and long-term stability and growth.

The Age of Service Mesh

Nov 28, 2019 By Gigi Sayfan In Squadcast

You have built a massively successful system. The users just can't get enough and keep requesting new features. Your developers crank out new services on a regular basis. Your DevOps/SRE team configures and scales your Kubernetes cluster (or clusters). As the system becomes more complicated and sophisticated, you realize that there are common themes that repeat across all your services.

Pavlos Ratis shares his experience on being an SRE

Nov 13, 2019 By Prakya Vasudevan In Squadcast

Pavlos is a Site Reliability Engineer based in Munich, Germany. He likes building software and expanding his knowledge of service reliability and the infrastructure behind it. He has created a few open-source SRE projects, such as awesome-sre, Wheel of Misfortune, Availability Calculator, and awesome-chaos-engineering, to help teams and individuals get on board with the SRE culture.

Automated Runbooks = Faster Recovery

Nov 11, 2019 By Shreyash Naithani In Squadcast

Traditional runbooks could become 10x more useful if they were automated or at least made executable (partly, if not fully). Shreyash Naithani, from the Microsoft Azure SRE team and author of "Practical Site Reliability Engineering", talks about how to take advantage of runbooks to eliminate toil.
