IncidentHub

How To Monitor Public Status Pages of Cloud Providers - a Step-by-Step Approach

Sep 22, 2024 By Hrishikesh Barua In IncidentHub

Incident updates on your cloud providers' public status pages are often the first indication of an outage. Providers also post updates about upcoming and ongoing maintenance there. Monitoring these status pages is therefore crucial to your business operations. This article guides you through the process of monitoring them effectively.
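
As a rough illustration of what monitoring a status page can involve, here is a minimal sketch that interprets a status document. It assumes the provider publishes a JSON summary with a `status.indicator` field taking the values `none`, `minor`, `major`, or `critical`. That shape is an assumption made for this example, and `SAMPLE`, `parse_status`, and `needs_alert` are hypothetical names. Check what your provider's page actually exposes.

```python
import json

# Assumed indicator values, ordered from healthy to most severe.
SEVERITY_ORDER = ["none", "minor", "major", "critical"]

# Hypothetical sample document; a real one would be fetched over HTTPS.
SAMPLE = '{"status": {"indicator": "major", "description": "Partial System Outage"}}'


def parse_status(document):
    """Return (indicator, description) from a JSON status document."""
    data = json.loads(document)
    status = data.get("status", {})
    return status.get("indicator", "unknown"), status.get("description", "")


def needs_alert(indicator, threshold="minor"):
    """Decide whether an indicator is at or above the alerting threshold."""
    if indicator not in SEVERITY_ORDER:
        # Unknown values are surfaced rather than silently dropped.
        return True
    return SEVERITY_ORDER.index(indicator) >= SEVERITY_ORDER.index(threshold)


if __name__ == "__main__":
    indicator, description = parse_status(SAMPLE)
    if needs_alert(indicator):
        print(f"Status changed: {indicator} ({description})")
```

Treating unknown indicators as alert-worthy is a deliberate choice in this sketch: a missed outage usually costs more than a spurious notification.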

Integrate Incident Alerts With Discord Using Webhooks

Sep 19, 2024 By Hrishikesh Barua In IncidentHub

Staying on top of outages in your third-party Cloud and SaaS services is crucial to maintaining the reliability of your own applications. If Discord is your communication tool of choice, you can keep up with such incidents by pushing these events to a Discord channel. Discord webhooks allow external applications to send messages to specific channels within a Discord server. This article describes how to integrate Discord as a channel in your IncidentHub account using webhooks.
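
To show what a webhook delivery looks like in general, here is a minimal sketch that posts a message to a Discord webhook using only the Python standard library. It is not how IncidentHub implements the integration. The function names, the message format, and the `User-Agent` value are hypothetical, and the webhook URL is whatever you copy from your own channel settings.

```python
import json
import urllib.request

# Discord limits the "content" field of a message to 2000 characters.
DISCORD_CONTENT_LIMIT = 2000


def build_discord_payload(service, status, url):
    """Build the JSON body for a simple text message."""
    text = f"[{status.upper()}] {service}: {url}"
    return {"content": text[:DISCORD_CONTENT_LIMIT]}


def build_request(webhook_url, payload):
    """Wrap the payload in a POST request aimed at the webhook URL."""
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "User-Agent": "incident-notifier/0.1",  # hypothetical client name
        },
        method="POST",
    )


def send(webhook_url, payload):
    """Deliver the message and return the HTTP status code."""
    with urllib.request.urlopen(build_request(webhook_url, payload), timeout=10) as resp:
        return resp.status
```

Keeping payload construction separate from delivery makes the message format easy to check without sending anything to a real channel.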

A Step by Step Guide to Checking if a SaaS is Down

Sep 17, 2024 By Hrishikesh Barua In IncidentHub

Modern businesses depend heavily on Software as a Service (SaaS). Almost all aspects of business operations - accounting, HR, payroll, marketing, IT, sales, support - depend on one or more SaaS applications. SaaS use is not limited to software development teams. Given this dependency, the uptime of these applications becomes tightly tied to a business's own uptime. Any SaaS downtime can affect both a business's daily operations and the user experience.

When Alerts Don't Mean Downtime - Preventing SRE Fatigue

Sep 12, 2024 By Hrishikesh Barua In IncidentHub

A recent question in an SRE forum triggered this train of thought. I've paraphrased the question to reflect its essence. There is plenty to unravel here. My first reaction to this question was that the SRE who posted this is in a difficult place with systemic issues.

Incident Archaeology - Dig Into Your Services' Past With IncidentHub's Availability Page

Aug 15, 2024 By Hrishikesh Barua In IncidentHub

A few weeks ago we released a feature on IncidentHub which gives you a historical view of your monitored services' availability.

Monitoring Specific Components and Regions in Your Third-Party Services

Aug 12, 2024 By Hrishikesh Barua In IncidentHub

Chances are, most of your third-party cloud and SaaS dependencies are globally distributed and operate in many regions. Chances are, too, that your applications use only a subset of what each service offers. If you are monitoring such a service, why should you receive alerts for all regions or every single component in the service? For example, if you use DigitalOcean, you might be using Kubernetes in their US locations (NYC and SFO). You would want to know only when there is an outage in one of these locations.
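
The filtering idea can be sketched as a lookup against the set of component and region pairs you care about. This is an illustration only: the `Incident` record, the `WATCHED` mapping, and the `matches` function are hypothetical and do not reflect IncidentHub's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """Hypothetical record for one incident reported by a provider."""
    service: str
    component: str
    region: str
    title: str


# Maps a service name to the (component, region) pairs worth alerting on.
WATCHED = {
    "DigitalOcean": {("Kubernetes", "NYC"), ("Kubernetes", "SFO")},
}


def matches(incident, watched):
    """Return True only if the incident touches a watched component and region."""
    pairs = watched.get(incident.service, set())
    return (incident.component, incident.region) in pairs


if __name__ == "__main__":
    incidents = [
        Incident("DigitalOcean", "Kubernetes", "NYC", "Cluster provisioning delays"),
        Incident("DigitalOcean", "Kubernetes", "AMS", "Control plane latency"),
    ]
    for incident in incidents:
        if matches(incident, WATCHED):
            print(f"ALERT: {incident.title} ({incident.region})")
```

With this configuration, only the NYC incident would produce an alert, while the AMS one is ignored.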

Integrate Your Monitoring System With PagerDuty Using Events API V2

Aug 3, 2024 By Hrishikesh Barua In IncidentHub

PagerDuty's Events API V2 lets you push events from your monitoring systems to PagerDuty. You can push an event whenever an incident is triggered, updated, or resolved.
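
A minimal sketch of building and sending such an event follows. The Events API V2 accepts the event actions `trigger`, `acknowledge`, and `resolve`, and events that share a `dedup_key` are correlated to the same alert, so a repeated `trigger` with the same key is one way to express an update. The function names and the validation rules here are choices made for this example, and the routing key is the integration key from your own PagerDuty service.

```python
import json
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
VALID_ACTIONS = {"trigger", "acknowledge", "resolve"}
VALID_SEVERITIES = {"critical", "error", "warning", "info"}


def build_event(routing_key, action, dedup_key, summary=None, source=None, severity="error"):
    """Build an Events API V2 body; only trigger events carry a payload here."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported event_action: {action}")
    event = {
        "routing_key": routing_key,
        "event_action": action,
        "dedup_key": dedup_key,
    }
    if action == "trigger":
        if not summary or not source:
            raise ValueError("trigger events need a summary and a source")
        if severity not in VALID_SEVERITIES:
            raise ValueError(f"unsupported severity: {severity}")
        event["payload"] = {"summary": summary, "source": source, "severity": severity}
    return event


def send_event(event):
    """POST the event to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return resp.status
```

Reusing the same `dedup_key` for the trigger and the later resolve is what lets PagerDuty close the right alert, so deriving the key from a stable incident identifier is a reasonable approach.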

Monitoring Third Party Vendors as an Ops Engineer/SRE

Jul 22, 2024 By Hrishikesh Barua In IncidentHub

Why should you monitor your third-party Cloud and SaaS vendors if you are in SRE/Ops? As part of an SRE team, your primary responsibility is ensuring the reliability of your applications. What makes you responsible for monitoring services that you don't even manage? Third-party services are just like yours - with SLAs. And outages happen, affecting you as well as many others who depend on them.

The Benefits of a Single Incident Management System

Jun 4, 2024 By Hrishikesh Barua In IncidentHub

How many monitoring tools do you have? Chances are at least 2-3. One tool usually does not cover all cases, and it’s usually a combination of self-managed and managed tools. Self-managed tools give you more control over custom configurations and cost. Managed ones take away the headache of running them yourself. Prometheus is the de facto standard for monitoring these days if you have a modern application stack and want to manage your own monitoring.

Monitoring Your Third-Party Cloud and SaaS Services is Critical

May 20, 2024 By Hrishikesh Barua In IncidentHub

If you have a software-based business, you are using at least a few cloud-based tools. It does not matter if you are a solo developer or part of a 50-member team in a large organization. Take this random list and chances are you are using at least half of them: Your entire business - irrespective of org or market size - including your development tools, collaboration/communication tools, infrastructure and hosting, monitoring, even email - is dependent on services that you don’t control.
