The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.
Do you know this situation? You are on-call and in the middle of the night you get a phone call. Loud enough to wake you up. Loud enough to wake your wife up, as well. You stand up and check your emails to see what the problem is. OK, you got it. Then you log on to the console of your monitoring tool and – green. Green? False alert? Why did you get the call then? After double-checking, still a bit sleepy, you realize that the problem has resolved itself automatically.
AppSignal now supports the next API version of PagerDuty. 🎉 One of our devs was on support rotation the other day, and a customer asked whether we could add support for the next API version of PagerDuty. We won’t tell you who it was, but this developer typically answers questions by solving things as quickly as he can. So, two days later, boom! The improved integration for PagerDuty went live.
Today, we are excited to announce that PagerTree has added 3 new chatbot services including Mattermost, Microsoft Teams and Google Hangouts Chat (this is in addition to our core Slack notification channel). Chatbots are available on all pricing tiers free of charge! :) If you don’t already have an account, sign up for a free trial now. Our chatbots will post alert details to a “channel” of your choice.
Earlier this year, as COVID-19 appeared, our global community of almost 800 employees became a fully remote workforce—effectively overnight. Now, all of us have had a taste of what it’s like to work from home all the time, from embracing the benefits of less time commuting and more time with our families, to the downsides of feeling isolated and missing seeing our colleagues in “real life.”
In this blog post series, I’ve explored the relationship between observability and a set of software delivery lifecycle practices that help organizations adopt DevOps practices and change their ways of working from being project-centric to product-centric. I started with Site Reliability Engineering, then considered Value Stream Management (VSM), and now finish with this post on Continuous Integration and Delivery (CI/CD).
Thales Cloud Protection & Licensing, part of the Thales Group, was looking to improve how it handles critical incidents. Whenever an incident hit, just gathering the incident team was a cumbersome and time-consuming task that involved a lot of manual work. Multiple calendar invites would be sent to different people inside and outside of the organization, multiple times, urging them to join calls and meetings.
This is the first in a two-part blog series deconstructing AIOps for ITOps leaders. If you gave me a dollar for every company that claims that they use “A.I.,” I’d be doing pretty well. But as a marketer, I can’t help but be a little skeptical about those claims. Let me explain.
This post outlines how to use Zabbix and iLert with multiple on-call teams, where each team is responsible for a set of host groups in Zabbix, and therefore, will only receive alerts for the services it is responsible for. But first, let’s start with the basic needs when being on-call.