Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

The Debrief: A year in review-2023 at incident.io

What a year 2023 was at incident.io! While it's hard to summarize 365 days into just a few sentences, a handful of moments stood out from this transformative year: So as we close the curtain on a momentous 2023, we sat down with the three co-founders of incident.io—Chris, Stephen, and Pete—to do a bit of reflection on the wild ride that was this year.

All I want for Christmas... from Slack

When declaring and responding to an incident with incident.io, most of your interactions with our product will go via Slack. You might configure your forms in our web dashboard, but the responder using them to declare an incident is most likely doing so from a Slack modal, and the incident announcement will be posted as a Slack message. This means a lot of our product design falls within the constraints of what we can build using Slack’s block kit.

Understanding ServiceNow Incident Management: A comprehensive guide

You’re focused on swiftly identifying, analyzing, and resolving disruptions in IT services. And you know all too well that correctly deploying and adopting incident management holds the key to delivering a more reliable and responsive IT environment for your applications and services. That’s why you’re using or are considering using ServiceNow’s incident management to ensure a structured and efficient approach to handling your IT service incidents.

Better Incidents Winter Bonfire: Inside On-Call

Engineers are bombarded with pages left and right. There's uncertainty about how to escalate. A constant blur exists between what's urgent and what can wait. This never-ending ping-pong game takes a toll. Burnout creeps in, and your engineering culture has taken a nose dive before you know it.

LLM Monitoring and Observability

Large Language Models (LLMs) are advanced artificial intelligence models designed to comprehend and generate human-like language. With millions or even billions of [parameters, these models, like GPT-3, excel in natural language processing, understanding context, and generating coherent and contextually relevant text across various applications.

Lessons in Incident Response I Learned While Waiting Tables

Before I stumbled into the tech industry (a story for another day), I spent several years in the customer service world as a server and front-of-house manager in restaurants. It was in these jobs that I first honed some critical skills that would later lead me on the path to incident response.

Getting started with IT operations automation

Tech companies face a daunting challenge: a staggering 90% of their IT teams are stuck doing mundane, repetitive tasks, leaving only 10% to focus on strategic innovation. Companies know that automation is the solution to these repetitive, low-level incident response actions; however, many need support to begin automating.