
The latest news and information on incident management, on-call, incident response, and related technologies.

Shhh... we have Private Incidents

We’re excited to announce that private incidents are now available on FireHydrant. For the first time, an incident’s visibility can be limited so that only permissioned users are able to see it. This is a great solution for security and compliance teams who need to collaborate with their engineering counterparts to resolve incidents. The incidents these teams work on differ dramatically in nature from operational incidents.

Uncovering the Importance of Mean Time Between Failures

In the IT world, application service providers (ASPs) build customer trust by ensuring the continuous, uninterrupted availability of their services and software. Service availability allows customers to operate normally and generate revenue without being directly impacted by their providers’ system failures. Though providers work to ensure system uptime, they are often challenged by unexpected technical issues that impact customer-facing systems.

Monthly Moo Update | December 2021

What a year 2021 has been for us all. We are extremely proud of the continuous innovation and delivery of new features and functionality we have provided throughout the year, all while maintaining enterprise scale and uptime that could win awards. We’ve heard success story after success story from our brilliant customers, each unique in their own way. We couldn’t have had the successful year we’ve had without you, and it’s been our honor to be part of your success.

BigPanda's ServiceNow integration just got better

ServiceNow is widely used across Fortune 1000 and Global 5000 enterprises, so it’s no wonder that the majority of BigPanda customers use ServiceNow and integrate with it to streamline their ticketing requests. BigPanda’s AIOps Event Correlation and Automation Platform provides context-rich incidents to IT Ops teams relying on ServiceNow and helps them gain end-to-end real-time visibility into their operations.

What we learned from AWS's us-east-1 outage

In case you missed it, for several hours on December 7, 2021, AWS's us-east-1 region had an outage that impacted multiple AWS APIs and took down various websites across the internet. According to our own monitoring at OnlineOrNot, the outage started at 2021-12-07 15:32 UTC and began to recover properly at 2021-12-07 22:48 UTC (with minor signs of life for a few minutes around 2021-12-07 20:08 UTC). Had we relied solely on AWS to update their status page before reacting, we would have been waiting a while.