Latest News

Sponsored Post

How important is Observability for SRE?

Observability is what defines a strong SRE team. In this post, we cover the importance of observability and how SREs can leverage it to benefit their business. Observability is the practice of assessing a system's internal state by observing its external outputs. Through instrumentation, systems can provide telemetry such as metrics, traces, and logs that help organizations better understand, debug, maintain, and evolve their platforms.

Rundeck + Squadcast Integration: Simplifying Alert Routing

Rundeck is an automation tool that helps make existing automation, scripts, and commands more secure, auditable, and easier to run. It is a job scheduler and runbook automation system that automates routine processes across development and production environments. It brings together task scheduling, multi-node command execution, and workflow orchestration. It also logs everything that happens in the system. Squadcast is an end-to-end incident response tool.

SREcon 2022 Americas Wrap Up

Hi everyone! We had a fantastic time at SREcon 2022 Americas last week, and I thought I’d share our stories and experiences. As the SRE community grows and evolves, these chances for collaboration become more and more important… and fun! Although I only attended virtually, I could still feel an exciting atmosphere as great minds came together.

SolarWinds Orion + Squadcast: Alert Routing Made Easy

SolarWinds Orion is a scalable infrastructure monitoring and management platform. It is designed to simplify IT administration for on-premises, hybrid, and software as a service (SaaS) environments, in a single pane of glass. SolarWinds Orion ensures you do not have to struggle with numerous incompatible point monitoring products, as it consolidates the full suite of monitoring capabilities into one platform with cross-stack integrated functionality. Squadcast is an end-to-end incident response tool.

What Is Site Reliability Engineering (SRE)? The SRE Role Explained

Historically, there was a clear delineation between what system administrators (SysAdmins) do and what application developers are responsible for in IT organizations. In recent years—especially in organizations focused on software development—these worlds have come together as IT operations and development teams adopt DevOps practices. The concept of site reliability engineering (SRE) was first introduced by a much-discussed book titled Site Reliability Engineering from Google.

Honeycomb + Squadcast Integration: Routing Incident Alerts Made Easy

Honeycomb is an application monitoring tool that helps DevOps and SRE teams to operate more efficiently by offering rich observability solutions and intuitive team collaboration. It helps understand complex relationships within your distributed systems and troubleshoot issues accordingly. Squadcast is an end-to-end incident response tool. Built with an SRE mindset, it streamlines all the incident response activities.

SRE Metrics: Four Golden Signals of Monitoring

SRE (site reliability engineering) is a discipline used by software engineering and IT teams to proactively build and maintain more reliable services. SRE is a functional way to apply software development solutions to IT operations problems. From IT monitoring to software delivery to incident response – site reliability engineers are focused on building and monitoring anything in production that improves service resiliency without harming development speed.

DevOps vs SRE - Reducing Technical Debt and Increasing Efficiency and Resiliency

This is one more blog topic stemming from the weekly office hours that we hold with the field team here at Shipa. In our last office hours, I was asked, “What is the difference between DevOps engineers and SREs?” Both professions are emerging disciplines and cultures that continue to evolve and play an important role in technology organizations. I’ve been fortunate to have written and spoken about this before, though this post takes a fresh look at what the two domains try to accomplish.