Alerting

How StatusHub Complements and Extends Your Incident Management Process?

Although the main focus of StatusHub is incident communication, it complements each of the 5 activities of Incident Management: Identification, Categorization, Prioritization, Response, and Communication with the user community throughout the life of the incident.

Postmortems and Retrospectives (class SRE implements DevOps)

Even after a service has been restored, SREs still have a bit of work to do. In this video, Liz and Seth discuss the postmortem process that SREs follow. Blameless postmortems and retrospectives are key to learning from failures and preventing recurrence. You will learn about the importance of conducting a postmortem, strategies for conducting a blameless postmortem, and techniques for trending retrospectives across your entire organization to gain better insights to prevent service disruptions in the future.

Overrides, the Most Human Feature in PagerDuty

If you’ve ever been on call, you know that the incidents don’t stop because you have the flu. Or when you’re attending your child’s high school graduation. Or, as I found out firsthand, even when you’re at your own wedding. Confucius once said, “If you have never had a major occasion happen while you are on call, then you may not have ever lived.” (Okay, I totally made that one up.)

It's Time to Start Talking about Digital Operations

IT operations teams have some of the most stressful jobs in IT. Keeping data centers online, servers running, enterprise systems functioning, and applications performing, all while responding to incidents and requests, is hard work. Although monitoring systems provide visibility and change management practices give IT some control over the network and environment, IT operations teams constantly feel like they are fighting a losing battle.

AlertOps Announces Playbook Automation Focusing on Critical Enterprise Needs in Fast-growing Incident Response Market

CHICAGO, Oct. 9, 2018 /PRNewswire/ -- AlertOps, an Illinois-based digital operations management and real-time collaboration platform, announces a renewed focus on enterprises in the IT Operations Management, DevOps, and SecOps spaces. CIOs and IT leaders need vendors that can merge technology and business scenarios to solve complex collaboration and communication problems.

OnPage to Provide Florida With Critical Communication for Hurricane Michael

OnPage, the global leader in Incident Alert Management and mass notification, aims to help people connect with each other and to provide their communities and businesses with critical communication tools free of charge for one month during Hurricane Michael. Tropical Storm Michael has become a hurricane, is moving toward the Gulf of Mexico, and is still likely to hit Florida’s northern Gulf coast on Wednesday.

Disruption Detector and Real Time Monitoring with Stackdriver (Cloud Next '18)

Aja built an interactive disruption detector panel that lets attendees at the Google I/O Conference intentionally cause errors in the system. This demo highlights the real-time monitoring feature of Stackdriver, which tracks all incoming errors and makes it easier for developers to pinpoint the issue. Watch the video to learn more.

Reduce Your IT Alert Noise and Fatigue

Two of the biggest IT headaches we hear about here at Zenoss are alert noise and alert fatigue. To help enterprises overcome these pain points, we co-developed an integration with PagerDuty. Recently, we launched the updated version of the PagerDuty ZenPack, which uses PagerDuty’s v2 API and is certified for Zenoss Cloud.

OpsRamp Webinar - OpsRamp + #ITSM: Incident Management For Superior Digital Performance

Manage your incident lifecycle with actionable insights so that you can prevent IT outages and reduce downtime. Proactive monitoring: drive system health, availability, and performance with policy-based monitoring for IT services hosted in data centers and public clouds.