Communicating to Users During Incidents

Imagine you're having a regular day at work: you open your browser to double-check something for a client in that web app your team built for them, when suddenly you see an error screen. You hit refresh a few times, just to be sure. Nope. Still down. What happens next depends on how well your team has planned for incidents like this (some folks call it unplanned downtime).

The 'Decade of IoT' is off and running

If 2021 was the soft launch of the Decade of the Internet of Things (IoT), 2022 is set to accelerate IoT-related technologies and investments, addressing societal and economic issues. The rollout of 5G, the maturing of artificial intelligence algorithms for streaming IoT data, increased computing power at the edge, and cheaper, better sensor technology together form the “convergence” that has supercharged IoT adoption.

New in StatusGator: Reordering Services

As our public status dashboards have become more popular, so has demand for the ability to customize them. Over the next several weeks, we will be rolling out a series of features that allow more customization of your dashboard. We’ve already added custom CSS capabilities. Today, we’re rolling out service reordering. Our new dashboard management page has a slimmed-down look.

The importance of SemVer for your applications

For some developers, SemVer can look merely cosmetic, nice to have, or simply useless. But following the SemVer format is mandatory for building reliable software. I'll explain how, over one year, we encountered two issues related to SemVer. The first was critical and led to a production outage, while the second made it very difficult for several companies to upgrade a managed service.

Improving your team's on-call experience

Your engineers probably dislike going on-call for your services. Some might even dread it. It doesn't have to be this way. With a few changes to how your team runs on-call and deals with recurring alerts, you might find your team starting to enjoy it (as unimaginable as that sounds). I wrote this article as a follow-up to "Getting over on-call anxiety".

The Observability Pipeline

Today’s systems are more distributed, dynamic, and complex than ever before, and users have higher expectations. The historical reliance on an operations team to monitor, triage, and/or resolve issues has also become untenable as the number of services has increased. This means that many of the tools that were well suited before might no longer be adequate.