
Alerting

Major IT Outage 2021 Recap

In 2021 we saw that no one is immune from major IT outages, not even giants like Google, Facebook, and Amazon Web Services (AWS). The following is a recap of some of the major IT outages with widespread impact in 2021. The historic AWS outage occurred on December 7, 2021, and lasted roughly six and a half hours. The breadth of Amazon and its reach caused not only their warehouse and delivery operations to stop.

Slack outage

Slack, a popular enterprise communications platform, faced a 5-hour system outage on February 22, 2022, from 9:25 AM to 2:24 PM EST. Affected Slack services included messaging, search, link previews, apps/integrations/APIs, posts/files, workspace/org administration, login/SSO, notifications, connections, and calls. AlertOps was NOT affected by this outage.

Cloud Incident Management Guide

It is a well-established fact that companies looking to grow in the digital age can facilitate this mission by adopting the cloud. When pursued with the right intent and implementation strategy, cloud adoption acts as a powerful force multiplier, yielding a cutting-edge IT powerhouse for businesses and helping them grow and innovate at an accelerated pace. Organizations that adopt a cloud-first strategy must safeguard themselves from critical, service-disrupting incidents.

Sprint planning - How to prioritize urgent production issues?

Members of small engineering teams wear a lot of hats while working on a product. It becomes hard to prioritize and deal with issues that arise in production when a sprint is already planned and in progress. This not only makes sprints harder to plan but also reduces accountability. How do you tackle this problem and make sure your engineering team does not burn out at the same time? Let’s list a couple of characteristics of such engineering teams that are quite common across the board.

Cut Out the Noise: Issue Grouping and Alerting Best Practices

We’re drowning in emails and Slack notifications. As our eyes glaze over, we start bulk-archiving everything into folders we will most likely never open again, and we notice critical bugs, crashes, or slowdowns sometimes weeks too late. Learn from Dustin Bailey, Solutions Engineer at Sentry, and Phillip Jones, Ecosystem Product Manager, as they share issue grouping and alerting best practices to help cut out the noise so you can start taking action on issues faster.

February 2022 Update - Centralized and time-based notification patterns

With our February update, it is now possible to centrally configure how Signls should be notified. And of course, each team can have a different configuration of their notification preferences. This also includes response and escalation settings. In addition, it is now possible to set different notification patterns per day and time of the day, e.g. to notify via different channels at night than during office hours.
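
As a rough illustration of the idea of per-team, time-based notification patterns, here is a minimal Python sketch. It is not the product's actual configuration format or API: the team name, the channel names, the `PATTERNS` table, and the `channels_for` helper are all hypothetical and only show how a rule lookup by weekday and time of day might work.

```python
from datetime import time

# Hypothetical per-team rules (illustrative only, not a real product schema).
# Weekdays use Python's convention: Monday is 0, Sunday is 6.
PATTERNS = {
    "ops": [
        {
            "days": {0, 1, 2, 3, 4},          # Monday to Friday
            "start": time(8, 0),              # office hours begin
            "end": time(18, 0),               # office hours end (exclusive)
            "channels": ["push", "email"],
        },
    ],
}

# Assumed fallback used at night, on weekends, and for teams without rules.
DEFAULT_CHANNELS = ["push", "voice"]


def channels_for(team, weekday, at):
    """Return the channels of the first rule matching the weekday and time."""
    for rule in PATTERNS.get(team, []):
        if weekday in rule["days"] and rule["start"] <= at < rule["end"]:
            return rule["channels"]
    return DEFAULT_CHANNELS
```

With these assumed rules, `channels_for("ops", 1, time(10, 0))` selects the office-hours channels, while a call at 2:00 at night falls through to the default channels.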

Exploring the Importance of Change Management in Healthcare

Change management is an organized, structured approach with methods that enable healthcare organizations to transform workflows seamlessly. Organizational change management requires the collective involvement of C-level executives and stakeholders to successfully implement changes within a care facility. Change is required when individuals, processes, teams, and tools cannot keep pace with the ever-changing needs and expectations of the organization.

Can your AIOps platform do Log Noise Reduction in addition to Alert Noise Reduction? If not, it is time to re-evaluate your AIOps

One of the core value propositions of AIOps platforms is to increase IT efficiency and productivity by applying AI and ML techniques to perform Alert Noise Reduction. This in turn translates to direct cost reduction due to savings in IT man-hours. In this approach, the AIOps platform acts as a gatekeeper for all IT alerts and events: it can effectively reduce and correlate such events so as to send meaningful incidents to the NOC or Service Desk.
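
To make the gatekeeper idea concrete, here is a minimal Python sketch of alert correlation. Note that it swaps in a plain rule-based grouping (same host and check within a time window) rather than any AI or ML technique, and it does not reflect how any particular AIOps platform works. The alert fields `host`, `check`, and `ts`, the `correlate` function, and the 300-second window are all assumptions made for illustration.

```python
def correlate(alerts, window_seconds=300):
    """Fold repeated alerts for the same (host, check) into one incident.

    An alert joins the open incident for its key if it arrives within
    window_seconds of that incident's most recent alert; otherwise it
    starts a new incident. Timestamps are plain seconds.
    """
    incidents = []
    open_by_key = {}
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        key = (alert["host"], alert["check"])
        incident = open_by_key.get(key)
        if incident is not None and alert["ts"] - incident["last_ts"] <= window_seconds:
            # Same source, close in time: count it instead of raising a new incident.
            incident["count"] += 1
            incident["last_ts"] = alert["ts"]
        else:
            incident = {
                "key": key,
                "first_ts": alert["ts"],
                "last_ts": alert["ts"],
                "count": 1,
            }
            incidents.append(incident)
            open_by_key[key] = incident
    return incidents
```

In this sketch, two CPU alerts from the same host one minute apart collapse into a single incident, while a third alert for the same key arriving well outside the window opens a new one.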