Incident.io

London, UK
2021
  |  By Chris Evans
On August 28th, 2023—right in the middle of a UK public holiday—an issue with the UK’s air traffic control systems caused chaos across the country. The culprit? An entirely valid flight plan that hit an edge case in the processing software, partly because it contained a pair of duplicate airport codes.
  |  By Lawrence Jones
Picture this: your alerting system needs to tell you it's broken. Sounds like a paradox, right? Yet that’s exactly the situation we face as an incident management company. We believe strongly in using our own products - after all, if we don’t trust ourselves to be there when it matters most, why should the thousands of engineers who rely on us every day? However, this poses an obvious challenge.
  |  By Martha Lambert
At incident.io, we run on a monolith. This brings a whole load of benefits that we don’t want to give up any time soon. We don’t have to worry about the speed of internal network requests, complex deployments, or optimizing work that touches multiple services. This blog post isn’t about the relative benefits of monoliths though (but we’ve written more about that here if you are interested)! Ownership in monoliths is tricky.
  |  By Lambert Le Manh
As a provider of incident management software, we at incident.io manage sensitive data regarding our customers. This includes Personally Identifiable Information (PII) about their employees, such as emails, first names, and last names, as well as confidential details regarding customer incidents, such as names and summaries. Consequently, we approach the management of this data with a great deal of care.
  |  By Jack Colsey
We've written several times about our data stack here incident, but never about our underlying data warehouse and the design principles behind it. This blog post will run through the high-level structure of our data warehouse and then will go in-depth into the underlying layers.
  |  By Pete Hamilton
Writing a meaningful update for customers every week has been held sacred at incident.io since we started the company. We've written over 200 of them in the past 4 years, and we recently celebrated going 2 years straight without missing a single a single week The numbers themselves are not the goal, but the consistency of this habit and what it represents for our customers and our team is very real, and special to me.
  |  By Sam Starling
With every job I have, I come across a new observability tool that I can’t live without. It’s also something that’s a superpower for us at incident.io: we often detect bugs faster than our customers can report them to us. A couple of jobs ago, that was Prometheus. In my previous job, it was the fact that we retained all of our logs for 30 days, and had them available to search using the Elastic stack (back then, the ELK stack: Elasticsearch, Logstash, and Kibana).
  |  By Milly Leadley
Indexes can make a world of difference to performance in Postgres, but it’s not always obvious when you’ve written a query that could do with an index. Here we’ll cover.
  |  By Charlie Kingston
The Digital Finance Strategy is a European directive that aims to support and develop digital finance in Europe while maintaining financial stability and consumer protection. There are three main components to the package: In this blog post, we’ll attempt to summarize the 113-page DORA proposal, highlighting how it will apply to incident management at financial entities. Side note: we also wrote a blog post about the other DORA, also known as the DevOps Research and Assessments.
  |  By Chris Evans
The origin of incident.io goes back to our days building Monzo, a UK-based bank, where Stephen, Pete, and I first crossed paths. As a bank, compliance with numerous regulations was, unsurprisingly, a top priority. When it came to incident management—something we were very involved in—this meant that every aspect of reporting, policy adherence, and root cause analysis (or "contributing factors," as we called it) had to be managed consistently and meticulously.
  |  By Incident.io
This week we walk through writing post-mortems in the app, from resolving the incident to building a comprehensive post-incident summary directly in-app.
  |  By Incident.io
Like it or not, AI is having a monumental impact on our lives. Most of the products we engage with today have AI features and functionality, aimed at assisting or completely replacing the actions normally taken by humans. When it comes to incidents, we’re firm believers of accelerating human actions, and believe the risk of over-automation far outweighs the benefits. In this live event we’ll dig a little deeper on why, as we cover the power and pitfalls of AI.
  |  By Incident.io
In this episode, Norberto (VP of Engineering) and Lawrence (Product Engineer) delve into the recent CrowdStrike incident that began on July 19th. Rather than focus on technical specifics, they provide a thoughtful exploration of key aspects that matter to us at incident.io, such as effective communication, overall response strategies, and proactive problem-solving during crises.
  |  By Incident.io
Gone are the days when incidents were manual to resolve, invisible to customers, and overall viewed with a negative lens. This is part two of the virtual event series as we dive into our fresh take on what incidents should look like, The Incident Way, and hear from customer stories putting these principles into practice.

Create, manage and resolve incidents directly in Slack. Leave the admin and reporting to us.

Improving your incident response, visibility, and ability to learn:

  • Less faffing, more fixing: We take care of the admin during incidents, so you can save your brainpower for the decisions that matter.
  • Divide and conquer: We make sure everyone’s role is clear, track who’s working on what, and help you escalate if you need extra help.
  • Get up to speed, at speed: Get everyone on the same page from the moment they join the incident, and help stakeholders stay in the loop.
  • Timelines, in no time: Constructing an incident timeline for review is important, but time consuming. We’ll build one for you in real-time, and keep it constantly up to date.
  • Data and insights you can trust: You’ve already paid for your incidents. By surfacing the data you need to make decisions, we help you get your money’s worth.

Incident response for your whole organisation.