Operations | Monitoring | ITSM | DevOps | Cloud

The latest news and information on Incident Management, On-Call, Incident Response, and related technologies.

ITIL®4 - Incident Management as a practice in building customer loyalty

Every incident is a moment of truth that can make or break the image of a service organization. Incident management is no longer considered a mere resolution process. It should be used to shape the customer experience and translate it into a strong value proposition. In this webinar you will learn how the Incident Management practice can have a positive impact on your operations.

Kubernetes Operators for Automated SRE

It can be quite challenging for an SRE team to maintain the well-being of a large-scale Kubernetes-based system with hundreds or thousands of services. In this blog post, Gigi Sayfan, author of “Mastering Kubernetes”, outlines the SRE challenge and how to achieve the ultimate goal of automated SRE with Kubernetes operators.

Release Notes: Stakeholder Engagement, Uptime Monitoring API, Flexible Periods for Schedules, and more

Nowadays, a working digital infrastructure is the lifeblood of almost any organization. The impact of a major IT incident can go far beyond the IT department, affecting a company’s revenue or incurring costs in other areas of the business through service disruption. Therefore, in addition to the IT department's technical response to a major incident, business stakeholders need to be involved as well, so they can prepare the business response.

How to Add Incident Alert Management to Your DevOps Pipeline

DevOps pipelines enable teams to implement continuous software development processes, often by using automation and collaboration tooling. The overall goal is to quickly release software products, updates, and fixes. To ensure a DevOps pipeline works well, teams add management and monitoring tooling to the pipeline. This includes incident alert management, which supports the team’s efforts in monitoring the security of various software and environment components.

Spring 2020 Launch: New Capabilities for a New Digital Era

The ongoing pandemic and resulting economic downturn have led to dramatically changing market conditions. As a consequence, technology teams have become increasingly concerned with minimizing their financial risk and reducing costs to mitigate the effects of abruptly pivoting to a fully remote working environment. For some, there has been a struggle to maintain business continuity, i.e., keeping the physical components of the business running when everyone is working from home.