
Latest Posts

Using observability tools to set SLOs for Kubernetes Applications

You deployed a service to your Kubernetes cluster. How do you know it is working as expected? In this blog, Gigi Sayfan, author of “Mastering Kubernetes”, talks about Kubernetes observability tools like Prometheus, Grafana and Jaeger, and how to use them to set proper SLOs and make sure the service meets its objectives.

Optimizing your alerts to reduce Alert Noise

Reducing alert fatigue starts with your monitoring platform: setting the right thresholds to trigger alerts and understanding which alerts are essential enough to send to your on-call platform. This post outlines some of the best practices that help you reduce alert noise and improve your on-call experience. The word noise implies something unpleasant and unwanted. Combine that with on-call and it adds a factor of annoyance to an already overwhelming process.

Leverage Jira with Squadcast throughout the incident lifecycle

Atlassian’s Jira is an issue and project tracking software that helps plan, track and manage projects. Jira is also used by customers and internal teams to log issue tickets for the product and engineering teams to look into and resolve. This forms a feedback loop between the customer-facing and product teams to help drive and deliver the best possible software. Jira is widely adopted by Agile development teams to customize workflows and embrace collaborative resolutions to ship good software fast.

Incident Response in the time of Remote Work

The unexpected and sudden shift to remote working introduces a new set of problems within the incident response space. While each organization needs to take its own unique circumstances into account, this post outlines best practices and steps that help keep operations both productive and proactive.

Top Monitoring Tools for DevOps Engineers and SREs

Over the years, with the increasing adoption of DevOps and SRE practices, monitoring has moved from a simple proactive practice to a necessity on any product launch checklist. It is crucial to pick a tool that meets your observability needs and ensures the reliability of your service for your customers.

Succeeding With Service Level Objectives

In this blog, Danny Mican, a Senior Site Reliability Engineer, outlines how to implement SLOs from scratch using the IIDARR process. He also stresses that SLOs must be actionable and must always follow a feedback-driven approach, since they play an important role in the debate of features vs. technical debt.

Hrushikesh shares his journey into SRE and his thoughts on the future of this space

Hrushikesh is passionate about solving complex design problems with simple and reliable solutions. He is technology and platform agnostic and doesn’t believe in limiting himself to just a few. He started his career in 2006 at a media company, where he was responsible for introducing new technologies and driving a team to deliver quickly. He does not limit his role to just development and operations and loves exploring everything in the tech space.