Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

AIOps and Dell's latest acquisition

Dell’s recent acquisition of Moogsoft is the latest validation of the growing market for automated ITOps, also known as AIOps. When legacy companies such as Dell recognize the importance of AIOps, it shows that the technologies behind automating ITOps are now mainstream and a vital part of every modern IT management stack. We look forward to seeing how Moogsoft’s integration into Dell will play out over the coming years.

In review: Gartner Hype Cycle for Monitoring and Observability

You know it’s going to be a great day when you find yourself mentioned as a sample vendor in Gartner’s widely read Hype Cycle report. The OnPage team is thrilled to share with its community that we have been mentioned as a sample vendor by Gartner in their latest Hype Cycle for Monitoring and Observability. Continuing its impressive streak of mentions this year, OnPage is featured specifically within the Automated Incident Response category.

Latest Developments in Monitoring and Observability, 2023

You know it’s going to be a great day when you find yourself mentioned as a Sample Vendor in the Gartner® Hype Cycle™ report for Monitoring and Observability, 2023 (July 2023). The OnPage team is thrilled to share with its community that we have been mentioned as a Sample Vendor by Gartner in their latest Hype Cycle for Monitoring and Observability. OnPage is recognized as a Sample Vendor, specifically within the Automated Incident Response category.

Custom fields: make FireHydrant your personalized incident management platform

Today we're releasing custom fields, a powerful new feature that empowers you to tailor FireHydrant to your organization's specific needs and capture essential incident details. Custom fields help you track critical states, involved parties, resolution specifics, affected services, messages, and more — almost anything you want! — all aligned with your unique workflows. Regardless of the size of your team or the maturity of your processes, custom fields adapt to your workflow.

210% ROI: unlocking the economic value of FireHydrant for incident management

In the fast-paced high-tech industry, efficient incident management is a critical factor in maintaining brand reputation, employee morale, and most importantly, your bottom line. Good practices can result in reduced downtime, increased learning opportunities from incidents, and an enhanced reputation among both the engineering community and customers. But quantifying the true cost of incidents has always been a challenge — until now.

Datadog and BigPanda: Observability and AIOps made better together

Datadog’s modern observability empowers development engineers with full-stack visibility, comprehensive instrumentation generation, and proactive alerts to accelerate software development releases and address potential incidents. While Datadog gives teams end-to-end visibility, it works even better together with AIOps from BigPanda – development teams gain insights into outside application dependencies and reliance on other systems.

10 Years of Failure Friday at PagerDuty: Fostering Resilience, Learning and Reliability

In today’s fast-paced and ever-evolving world of technology, failure is inevitable. Organizations should embrace failure as a learning opportunity for how to build and deliver more resilient services. At PagerDuty, we’ve practiced Failure Friday for 10 years now. Failure Friday, a practice inspired by the chaos engineering space, involves intentionally injecting failures into our systems to improve reliability and foster a proactive engineering culture.

The Unplanned Show, Episode 6: Defining AIOps with Heath Newburn

“AIOps” is a term some love to hate, but what makes it useful? In this episode, Heath Newburn breaks down the three things to look for in an AIOps solution: reducing noise, creating context, and reducing toil. He also explains the challenges of domain-specific versus domain-agnostic approaches to AIOps. But even within that approach, Heath warns of “gotchas” around rules “tech debt”, data formats, and long overall implementation times.

We used GPT-4 during a hackathon: here's what we learned

We recently ran our first hackathon in quite some time. Over two days, our team collaborated in groups on various topics. By the end of it, we had 12 demos to share with the rest of the team. These ranged from improvements in debugging HTTP request responses to the delightful “automatic swag sharer.” Within our groups, a number of us tried integrating with OpenAI’s GPT to see what smarts we could bring to our product.