The latest News and Information on Service Reliability Engineering and related technologies.

Software Monitoring - Stuck in the 00s

Mar 8, 2024 By Piyush Verma In Last9

A short history of software monitoring, from the 00s. What has changed? Why are things so arcane?

5 Easy Ways to Reduce Work-Related Stress for SRE Professionals

Mar 6, 2024 By Tiffany Cox In Rootly

It's completely normal to feel a little overwhelmed and stressed out at work these days. Technology has collaboration moving at the speed of light, and time away from screens is at an all-time low, blurring the lines between work and personal time. Plus, it's hard to ignore the multitude of tech outages that have been making headlines lately, leaving teams anxiously on edge. When you are a professional with on-call cycles, the potential of outages adds another level of complexity to the mix.

The Role of APM in DevOps and SRE Practices

Mar 5, 2024 By Keren Feldsher In Coralogix

As the software development world becomes faster, enterprises must adapt to customer demands by increasing their application’s deployment frequency. They often rely on DevOps and Site Reliability Engineering (SRE) methodologies to achieve this. These approaches ensure high system availability amidst frequent deployments and prioritize delivering a seamless user experience.

How do you build resilient systems to manage the IPL with 30+ million concurrent users?

Mar 1, 2024 By Last9 In Last9

The Indian Premier League is a unique sporting event for a dozen reasons. But for engineers in India, it’s one of a kind. Very few companies can boast of managing 30+ million concurrent users, and this number grows every year: last year, we witnessed ~60 million concurrent users.

View Video

Navigating the Evolving Landscape: A Deep Dive into REST API Versioning Strategies

Feb 29, 2024 By Vishal Padghan In Squadcast

In the ever-evolving landscape of APIs, ensuring seamless interactions and managing changes becomes crucial. While innovation and adaptability are essential, maintaining backward compatibility is equally important to avoid disruption for existing users. This is where REST API versioning comes into play. Versioning allows you to introduce new features or changes to your API in a controlled manner, while simultaneously keeping older versions running smoothly.

Balancing Innovation and Reliability: A Guide for SRE Teams

Feb 28, 2024 By Vishal Padghan In Squadcast

In today's rapidly evolving technological landscape, striking a balance between innovation and reliability is a constant challenge for Site Reliability Engineering (SRE) teams. On one hand, businesses and customers crave the constant stream of new features and functionalities that fuel progress. On the other hand, ensuring system stability, minimal downtime, and optimal performance remains paramount for user experience and business continuity.

Best Practices For Building A Resilient On-Call Framework

Feb 27, 2024 By Chitra Bisht In Squadcast

No business, whether small, medium-sized, or a large enterprise, is exempt from downtime. However, the sooner an issue is acknowledged, the quicker the response and the smaller the impact on the business. An effective On-Call framework not only aids prompt issue resolution but also plays a vital role in minimizing the overall impact of downtime on business operations.

The 6 Best Incident Management Software in 2024

Feb 27, 2024 By Abhishek Sony In Squadcast

When the siren blares and your IT infrastructure is under siege, panic can be your worst enemy. In the heat of these digital battles, robust incident management software becomes your indispensable weapon. Forget fumbling through spreadsheets and frantic Slack threads - you need a clear-headed commander-in-chief, a champion of incident response who orchestrates your team to victory.

Streamlining Incident Management With Squadcast and ServiceNow Bidirectional Integration

Feb 27, 2024 By Squadcast In Squadcast

Revisit our webinar to explore how Squadcast’s latest bidirectional integration with ServiceNow can help you get the most out of your ServiceNow implementation. Discover the integration's key features and benefits, designed to streamline incident resolution and enhance collaboration within your DevOps and IT teams. Learn, share, and grow with us as we journey towards a more reliable and efficient digital world.

View Video

Incident Commander Training Strategies: What The Books Don't Tell You

Feb 26, 2024 By Zhuang (Strong) Liang In Rootly

This article has been lightly revised and reposted, with the author's permission, from the original article on Medium. So, you’re training incident commanders (ICs), and you have your group read Google’s SRE books. Everyone knows what they are supposed to do and you are ready for any incident, right? Not quite. Half of your team complains that the descriptions are too vague or don’t apply to their situations, and the other half just starts to improvise. The result?
