Latest Posts

Creating a Better Incident Response Plan

May 10, 2021 By Biju Chacko In Squadcast

A few minutes of unexpected downtime can have catastrophic effects! Having a great incident response plan is more than a luxury - it is a necessity for organisations of all sizes today. This blog outlines key activities that can help you formulate a better incident response plan.

Top SRE Toolchain Used By Site Reliability Engineers

May 7, 2021 By Biju Chacko In Squadcast

We have compiled a list of the most popular and sought-after tools (some of which you may have heard of) that SREs need in their toolkit. Site reliability engineering (SRE) practices help organizations keep their deliverables running smoothly, with high reliability and resilience. This can be achieved with a set of well-defined tools deployed at every phase of the production system to keep up with SRE best practices.

Using Distributed Tracing in Microservices Architecture

May 6, 2021 By Biju Chacko In Squadcast

With the rise of microservices-based cloud applications and their corresponding complexities, the need for observability is greater than ever. This blog looks into the what and why of distributed tracing, along with a few best practices for adopting it in a microservices architecture. Distributed tracing for microservices architecture is an emerging concept that is gaining momentum across internet-based business organizations.

7 Ways SRE Is Changing IT Ops And How To Prepare For Those Changes

Apr 29, 2021 By Squadcast Community In Squadcast

SRE best practices are disrupting and catalyzing change in the way organizations approach IT operations. In this blog, we look at 7 ways SRE is driving this transition. Site Reliability Engineering (SRE) is a relatively new practice that has been growing in popularity among many businesses. It puts a premium on monitoring, tracking bugs, and creating systems and automation that solve problems in the long term.

Reduce Toil with Better Alerting Systems

Apr 8, 2021 By Biju Chacko In Squadcast

If not tackled early, increasing toil can affect the morale and productivity of your SRE team. In this blog, we look at some of the ways you can counter toil with better alerting systems. Are you an SRE or on-call engineer struggling to manage toil? Toil is any repetitive or monotonous activity that can lead to frustration within an incident management team. At the business level, toil also adds no functional value towards growth and productivity.

How to configure services in Squadcast: Best practices to reduce MTTR

Mar 31, 2021 By Biju Chacko In Squadcast

With the rise of digital platforms, IT infrastructure has grown complex to the point where multiple application interdependencies coexist with varied architectures and on-call team types. This blog looks at how you can model your infrastructure in Squadcast to reduce the time it takes to respond to and resolve incidents.

Overview of Incident Lifecycle in SRE

Feb 23, 2021 By Biju Chacko In Squadcast

Incidents that disrupt services are unavoidable, but every breakdown is an opportunity to learn and improve. Our latest blog is a deep dive into best practices to follow across the lifecycle of an incident, helping teams build a sustainable and reliable product - the SRE way. As the saying goes, “Every problem we face is a blessing in disguise”.

Error Budgets and their Dependencies

Feb 3, 2021 By Adam Hammond In Squadcast

Does your team struggle with an unbalanced error budget that impacts your reliability and pace of innovation? In his latest blog, Adam Hammond talks about error budgets, which account for the planned and unplanned outages that your systems may encounter, and how teams can calculate them efficiently.

7 Tips On Building And Maintaining An SRE Team In Your Company

Jan 22, 2021 By Squadcast Community In Squadcast

In today's "always on" world, reliability is a primary business KPI. Plant a culture of reliability by implementing these 7 simple tips to build a solid SRE team in your organization. Many of today’s hottest jobs didn’t exist at the turn of the millennium. Social media managers, data scientists, and growth hackers were unheard of back then. Another relatively new job role in demand is that of a Site Reliability Engineer, or SRE.

The Key Differences between SLI, SLO, and SLA in SRE

Jan 20, 2021 By Biju Chacko In Squadcast

To incentivize reliability in your platform, your team needs shared goals for measuring and quantifying the capabilities of your product or service, along with the customer experience. Define the path to "Always-On" services by understanding a few key SRE fundamentals and their implications: SLIs, SLOs, and SLAs. Framing SRE metrics for building or scaling a product is quite a daunting task.
