Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Recapping this year's AWS re:Invent 2022

Amazon recently concluded their five-day long conference, AWS re:Invent 2022. This year’s conference was hybrid with the company streaming a significant portion of their in-person conference for free. For ten years now, the event has seen attendees across the cloud continuum come together to learn, share and get inspired. This year was no different as we saw some of the biggest names in cloud computing make their presence felt at the conference in Las Vegas.

Taking incident management to the next level with an internal developer portal

There is no denying that incident management is one of the most crucial processes concerning the service and business aspects of software deployment. Not having a robust system in place to address and remedy unfortunate incidents can lead to user dissatisfaction, which can ultimately take a toll on your business metrics. A suboptimal management system can also have adverse impacts internally if it prioritizes efficiency and speed of recovery to the point of neglecting employee well-being.

PagerDuty App for ServiceNow: Extend ITSM with Real-Time Digital Operations

Watch this demo to learn about extending your ITSM solution with Real-Time Digital Operations via the PagerDuty App for ServiceNow. You'll learn about what you will get out of the box and will see the integration in action.
Sponsored Post

Outages ITOps professionals are thankful to avoid

As we settle into the time of year when we reflect on what we're thankful for, we tend to focus on important basics such as health, family and friends. But on a professional level, IT operations (ITOps) practitioners are thankful to avoid disastrous outages that can cause confusion, frustration, lost revenue and damaged reputations. The very last thing ITOps, network operations center (NOC) or site reliability engineering (SRE) teams want while eating their turkey and enjoying time with family is to get paged about an outage. These can be extremely costly - $12,913 per minute, in fact, and up to $1.5 million per hour for larger organizations.

How to choose an incident management software

The ITIL definition of an incident is “an unplanned interruption to or a quality reduction of an IT service”. In your IT ecosystem, an incident may be caused due to a malfunctioning asset, or a network failure. Common incidents include issues with the printer, Wi-Fi connectivity, application locks, email service, laptop, file sharing, unresponsive servers, or even authentication errors.

Best practices for on-call scheduling and management

An on-call schedule forms the backbone of your incident response system in the event of an outage or when an issue is raised. This type of schedule does not keep end-users waiting and helps maintain the reliability and availability of your software. However, on-call management practices often induce worry and anxiety in team members. In extreme cases, it can even be a contributing factor in employee burnout.

5 tips for a more modern and efficient on-call management

‍ On-call management is one of the most important aspects of seamless IT service. Its aim is to ensure that the right person is notified in the case of an incident, so that they can react accordingly as quickly as possible. In certain cases, many people have to be notified. To achieve this as efficiently as possible, it is vital to have an up-to-date and smoothly functioning system.