Latest News

Tips for Modern NOCs - Correlating Incidents to the IT Changes that Caused Them

Every NOC engineer will tell you that the first thing they look for in an outage is “what changed?” And they are right to look. While every organization is unique, Gartner reports that on average about 80% of IT incidents today are caused by changes in infrastructure and/or software.

OnPage Overrides Silent Switch on iOS and Do Not Disturb Mode

Since its inception, OnPage has been dedicated to providing a powerful critical alerting solution. This mission continues in 2020, as OnPage is pleased to introduce its ability to override the silent switch and Do Not Disturb (DND) mode on iOS. The latest advancements ensure that tasked recipients always receive high-priority, audible OnPage alerts, regardless of their current iPhone settings.

When Incidents are not investigated, Problems await

Incident Management and Problem Management are two very different disciplines in IT service management, yet the terms are unfortunately often used interchangeably. On the surface, it might just seem like a matter of terminology. But what if you learned that one is a small hiccup and the other could dent your entire quarterly or annual results?

Fostering Psychological Safety in Remote Teams

Psychological safety is a crucial component of any organization’s culture. People in psychologically safe organizations are free to create, discuss, disagree, take risks, and make mistakes. These organizations are often the ones we see as key innovators in their industries. In other words, cultivating a culture of psychological safety is paramount to success.

On-call On-boarding Checklist

And it starts with the company culture. Irrespective of how small or large your team is, it’s wise to invest some time in creating a good on-call onboarding plan. A humane on-call is the mark of a good engineering culture. Being on-call means that you’re expected to be reachable for any issues that may occur during your shift. It’s easy to lose any and all motivation by just anxiously anticipating that mid-dinner ping.

Creating powerful automations with n8n and Mattermost

Tanay is the Head of Developer Relations at n8n. He has published books on WebVR, virtual assistants on Raspberry Pi, and FirefoxOS. He has been listed in the about:credits of the Firefox web browser for his contributions to the different open source projects of the Mozilla Foundation.

I’ve been involved in the DevOps world for a while and yet I finished reading The Phoenix Project only recently. The book piqued my interest in how teams execute their incident response playbooks.

Announcing Our Series A

It was Friday at about quitting time, and my plans for the evening involved a great cocktail, hanging out with friends, and maybe continuing to binge The Office. Sadly, there was a problem. Our alerting system detected an enormous and immediate spike in errors. The error description was along the lines of “table ‘servers’ does not exist” and thousands of customers couldn’t use a large cloud provider’s services.

Driving Real-Time ChatOps With PagerDuty and Microsoft Teams

With over 75 million daily active users, it’s safe to say Microsoft Teams is essential to many global businesses. On top of that, Microsoft CEO Satya Nadella recently shared that Microsoft saw 200 million meeting participants in a single day this month. While Microsoft Teams’ explosive growth can be tied to recent spikes in remote work, many enterprises have relied on Teams to connect people across the globe for quite some time.

The MSP's competitive edge: a new standard of incident response

In an effort to streamline operations and improve cost efficiency, organizations large and small are turning to managed service providers (MSPs) to outsource key IT activities. In fact, this shift is so far-reaching that the global MSP market is expected to exceed $375 billion by 2025! Indeed, outsourcing to MSPs brings many advantages.