The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Alerting of Service Technicians in Facility Management

In buildings today, there are numerous systems that require regular maintenance or that need attention as quickly as possible if problems are detected. This applies, for example, to heating systems, air conditioning, cooling, ventilation, elevators or fire alarm systems. Modern facility management systems are able to reliably monitor such systems.

Incident triage: a key element in your MTTR

One of the key performance indicators for IT Ops is MTTR (Mean-Time-To-Resolution). MTTR essentially measures the length of your incident management lifecycle: from detection; through assignment, triage and investigation; to remediation and resolution. IT Ops teams strive to shorten their incident management lifecycle and lower their MTTR, to meet their SLAs and maintain healthy infrastructures and services. But that’s often easier said than done.
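As a rough illustration of the metric described above, MTTR can be computed as the average time between detection and resolution across a set of incidents. The sketch below assumes each incident is recorded as a (detected, resolved) timestamp pair; the function name and the sample data are hypothetical and not taken from any particular tool.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Return the average of (resolved - detected) over all incidents.

    `incidents` is an iterable of (detected, resolved) datetime pairs.
    This record layout is an assumption made for the example.
    """
    incidents = list(incidents)
    if not incidents:
        raise ValueError("need at least one resolved incident")
    # Sum the individual durations, starting from a zero timedelta.
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical sample data: three incidents lasting 45, 90 and 15 minutes.
incidents = [
    (datetime(2021, 4, 1, 9, 0), datetime(2021, 4, 1, 9, 45)),
    (datetime(2021, 4, 2, 14, 0), datetime(2021, 4, 2, 15, 30)),
    (datetime(2021, 4, 3, 22, 0), datetime(2021, 4, 3, 22, 15)),
]

mttr = mean_time_to_resolution(incidents)
print(mttr)  # 0:50:00
```

Measuring from detection to resolution covers the whole lifecycle, so a slowdown in any single phase (assignment, triage, investigation or remediation) shows up in the result.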

The Cost of IT Downtime: An Overview

As the adoption of cloud computing continues to encourage innovation across industries, high-performing and resilient systems have become a necessity in order to keep pace with the competition and meet internal and external SLAs (service level agreements). In terms of customer expectations, a minute of downtime can mean thousands of dollars in lost opportunity and a damaged customer relationship. So what exactly is downtime?

Using Remote Actions to Create ServiceNow Incidents

Recently we have received a lot of requests for Enterprise Alert not only to alert on critical situations but also to proactively initiate, record and track those situations through ITSM tools such as ServiceNow and BMC Remedy. This post centers on what happens when critical systems fail and tickets are not created in ServiceNow due to a break in the workflow.

Key Learnings from the Facebook Status Page

Yesterday, April 8th 2021, at around 22:00 UTC, Facebook experienced a major outage in which Facebook, Messenger, WhatsApp web and Instagram were down for as long as 3 hours. The outage was reported on Facebook’s status page, which was a good example of how to communicate an incident.