February 2023

Taking the fear out of migrations

Over the last 18 months at incident.io, we’ve done a lot of migrations. Often, a new feature requires a change to our existing data model. For us to be successful, it’s important that we can seamlessly transition from the old world to the new as quickly as we can. There are few things in software where I’d advocate a ‘one true way,’ but the closest I come is probably migrations. There’s a playbook that we follow to give us the best odds of a smooth switchover.

Game Day: Stress-testing our response systems and processes

At incident.io, we deal with small incidents all the time: we auto-create them from PagerDuty on every new error, so we get several of these a day. As a team, we’ve mastered tackling these small incidents since we practice responding to them so often. However, like most companies, we’re less familiar with larger and more severe incidents, like the kind that affect our whole product or a part of our infrastructure, such as our database or event handling.

Making transparency a principle of your company's culture

You’ve probably heard the phrase “transparency is key” more than you can bear at this point—so let’s get this out of the way. Transparency is key. The phrase suddenly became that much more unbearable. But before you drop off, let me also communicate something else: transparency is often not enough. Often, companies make the mistake of leaning on transparency as a catchall solution to many of their internal comms issues.

Your non-technical teams should be using incident management tools, too

For many businesses across the world, incident management is something that’s usually left to engineers. These teams are on the front lines, declaring, managing, and resolving all sorts of incidents across the org, regardless of where they originate or what form they take. But there’s a glaring issue with this approach. Outside of technical teams, people across organizations aren’t accustomed to, or trained in, using the word “incident” whenever an issue comes up.

Here's what to focus on when reviewing an incident

Incidents can be a bit noisy. Especially in a higher-severity incident, there are a lot of moving parts that can make it difficult to come away with the information you want at a glance. And if you’re someone who isn’t necessarily tapped into the day-to-day of incident response, such as a department head or an executive, you’ll want to be able to glean the most actionable information in just a few seconds without having to dig through dense documents.