The latest news and information on incident management, on-call, incident response, and related technologies.

July 2019 Update: Alert Opt-In and Out, Apps Section and Getting Started

The July 2019 update introduces the option to opt out of certain alert categories, along with some enhancements to the web portal. You can now opt in to or out of individual categories under Settings -> Services & Systems. The setting applies per user, which is useful when you do not want to receive certain alerts but your team members still need to get them. Another scenario is listening in: you can see what is going on while all notifications stay muted.

The Ins and Outs of Postmortem Documentation

No matter how you design your architecture or what technologies you implement, critical incidents will happen. When things go wrong, it is easy to get carried away and forget about the bigger picture. But your work isn’t done after you fix the immediate problem; now is the time to take a look at how the incident actually happened so that you can learn from it.

One Size Does Not Fit All: Tailoring Incident Response Messages to Different Stakeholders

In a simpler world, incident response notifications would be one-size-fits-all. You could deliver the same notification to everyone with equally successful results. But in the real world, incident response messages must be nuanced. Unlike baseball hats or wristwatches, the messages you send to different stakeholders when an incident occurs need to be tailored to each category of recipient.

Gain more control of your incidents with Incident Split & Merge

Usability and control are key elements in BigPanda’s Operations Console, where IT Ops, NOC and DevOps users investigate and take action on IT incidents. Let’s take a look at how BigPanda’s Split Incident and Merge Incidents features give IT Ops an added layer of control. BigPanda’s Open Box Machine Learning correlates and enriches the overwhelming number of alerts IT Ops has to deal with, turning them into a much smaller number of insight-rich incidents (often 95% fewer).

Top 11 Incident Response Influencers to Follow in 2019

The incident response industry is anything but static, and it is often said that the key to staying ahead is staying informed. But that’s easier said than done. Faced with the increasing sophistication of cyber attacks and the growing complexity of IT architectures, we often drown in our daily slew of tickets and alerts, with no time left to spare.

Cut Down Distractions, Reduce Stress and Focus on Critical Priorities with OpsRamp's First-Response Policies

Modern hybrid, multi-cloud, and cloud native environments have created increased management complexity for enterprise IT teams. Dynamic and distributed applications, infrastructure and business-critical services are constantly generating more data in the form of metrics, events, and alerts.

Announcing Flare: Make opening incidents stress free

We’re launching a new feature today that allows anyone in your organization to kick off your incident response process from Slack, with an appropriate severity level attached. Often people are afraid to open an incident or even share that they’re aware of something going wrong with your applications. When everything is important, nothing is important; users frequently overestimate the impact of an incident and assign an inappropriately high severity level.

SecOps for the Cloud: PagerDuty and AWS Security Hub

This week at re:Inforce in Boston, the AWS team showed off Security Hub, a service that gives SecOps teams a comprehensive view of their high-priority security alerts and compliance status across their AWS accounts. We’re excited to join AWS at re:Inforce as a Security Hub partner, where we’ll show users how PagerDuty and AWS Security Hub work together to provide real-time SecOps to any team using AWS.