Stress-free production deployments: Overcoming high-complexity upgrades with Raygun
The challenge of managing complex web app upgrades
We live in an interconnected world which is often reflected in the complexity of websites. The need to share information in an efficient manner has led to a complex network of dependencies. This comes with many advantages but also many challenges.
The challenge I want to cover today is change: keeping things up to date and moving forward without breaking anything.
Software upgrades can be a controversial topic as the initial impression may be that if we change nothing then nothing breaks. Why not just keep things as they are? Why is change even necessary?
But keeping things as they are requires ongoing effort. It’s like trying to keep your balance on the roof of a speeding bullet train: you need to move fast just to stay steady in a world that’s always racing forward.
Think about the most complex application you’ve seen. How many possible user flows does it have? What are all the possible scenarios where something can go wrong?
Visualizing all of this can be quite intimidating. Considering this, is there a way to make changes with confidence, breezing through major upgrades on time and without notable issues?
Enterprise-level CMS upgrade case study
I will showcase a process that uses the Raygun client for Silverstripe CMS, but the same principles apply to any other application with a Raygun integration.
Our example application will be an enterprise-size website running on CMS 4 with PHP 8.1. This application has a huge codebase that was developed over the span of several years with multiple different developers contributing. Hence, we can’t rely on a few specific individuals to recall how the site works in detail. Instead, we have to rely on existing documentation, automated test suites, and good processes.
Code coverage for our application is about 70%, which is pretty high given the codebase size, but it also means we can’t rely on automated tests alone to find every issue. Automated tests run against mock data only, while our website has over ten thousand pages, so many real user scenarios may go untested. This is where Raygun becomes a valuable complement to the automated test suite, as I will show in detail.
Our goal is to upgrade the application to CMS 5, preferably without any issues on our go-live day.
Key strategies for successful large-scale web application upgrades
1. Strategic planning: Minimizing risk in major CMS upgrades
Major upgrades need to be planned well ahead of time. All dependencies that block the upgrade need to be cleared before the upgrade can commence. It’s important to clearly define the upgrade scope. The general recommendation is to make the major upgrade as small as possible to minimize the risk. This can often be done by identifying optional upgrade parts that can be actioned later.
In our specific example, we have the option to upgrade the LinkField module from version 2 to either version 3 or 4. Our choice is to go for version 3 now and upgrade to version 4 later. The same applies to the PHP version. We could upgrade from PHP 8.1 to PHP 8.2 right away, as CMS 5 supports both versions, but we will stay on PHP 8.1 for the upgrade and move to 8.2 later.
2. Proactive preparation: Leveraging deprecation warnings and staging
Once the upgrade scope is clearly defined, we will further reduce the scope by reviewing any deprecation warnings. Many of these warnings can point us to changes that can be done ahead of the upgrade so we can action them separately, making the upgrade itself smaller and thus less risky.
Our application is running on CMS 4.13, which has the ability to report deprecation warnings. This capability is not limited to Silverstripe CMS, so it’s worth reviewing the documentation of the frameworks and libraries you use to check whether something similar is available.
We will use our automated test suite first to report deprecation warnings but, as I pointed out earlier, this comes with notable limitations. This is where Raygun comes into play. We can use a staging environment, which is a test environment configured to be as close as possible to production while still being separate from it. That separation makes it safe to turn on the deprecation warnings.
To allow Raygun to collect and report these deprecation warnings, we need to get some traffic to the staging environment. This can be done in multiple ways, and the preferred option depends on your setup. I will list some notable options below.
- Use a crawler to recrawl the whole website which is hosted on your staging environment
- Use a penetration testing tool to generate traffic to your staging environment
- Use available background processes such as static cache generation to re-render the whole site
Whatever option you choose, you will likely end up with a large list of Raygun errors that you need to process. Thankfully, Raygun provides several tools to help you with that. The most fitting ones are listed below.
- Error filtering which allows you to categorize deprecation warnings
- CSV export which might be useful if you need to get the deprecation warnings data into another tool
- Temporary or permanent silencing of specific types of errors - this is useful for cases where you can’t really fix the code that’s causing the deprecation warning as it’s not located in your codebase
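If you go down the CSV export route, a small script can turn the raw export into a ranked to-do list. The sketch below groups rows by message and totals their occurrences. The column names `Message` and `Count` are assumptions made for this example, so check the header row of your actual export and adjust the parameters to match.

```python
import csv
import io
from collections import Counter


def summarise_deprecations(
    csv_text: str,
    message_column: str = "Message",  # assumed column name
    count_column: str = "Count",      # assumed column name
) -> list[tuple[str, int]]:
    """Total occurrences per warning message, most frequent first."""
    totals: Counter[str] = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        message = (row.get(message_column) or "").strip()
        if not message:
            continue  # skip rows without a usable message
        try:
            occurrences = int(row.get(count_column) or 1)
        except ValueError:
            occurrences = 1  # treat an unparsable count as a single occurrence
        totals[message] += occurrences
    return totals.most_common()


# Example usage with a hypothetical export file:
#   with open("deprecations.csv", newline="", encoding="utf-8") as handle:
#       for message, total in summarise_deprecations(handle.read()):
#           print(f"{total:6d}  {message}")
```

Sorting by total occurrences is only one possible ordering. You may prefer to group by the module that raises the warning, so that each group maps to a single piece of work.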
Once we’ve processed all the deprecation warnings that can be actioned ahead of the upgrade, we can finally proceed to the main upgrade.
3. Executing the upgrade: Balancing testing and real-time error tracking
Major upgrades tend to break a lot of things, but we can leverage our reasonably high test coverage to spot these issues early and action them appropriately. Eventually, we will arrive at a semi-stable state where we can deploy our upgraded application to our staging environment for further testing.
Of course, like in the case of deprecation warnings, we will be using Raygun to find issues for us. This time, however, these will be real errors not just deprecation warnings. It’s a good idea to start by addressing the previously documented deprecation warnings that couldn’t be resolved earlier.
4. Smooth rollout: Best practices for deploying major upgrades
We’ve fixed all the issues we’ve found so we can feel really confident now and go ahead with scheduling a production release for our upgrade. I’ll list some key recommendations below.
- Ideally, avoid including any other changes in your production release for this upgrade
- Space out the upgrade from other releases (one before and one after). The spacing depends on your project setup, but for heavily cached sites you may want to separate the releases by a week or so. This gives changes time to propagate through cache layers and gives you time to detect potential issues
- Provide a rollback plan in case things go unexpectedly wrong, and think through the worst-case scenario
- Document the expected impact on your users such as content freezes for content authors
Hopefully, all goes well with the production release but we’re not done just yet. Even after this rigorous preparation and detailed process, we might still experience some issues after the upgrade. This is because a good process only aims to prevent the most severe issues and minimize the likelihood of issues in general but it’s not possible to fully prevent them from happening.
Fortunately for us, Raygun is here to bail us out yet again. We should have a post-release window during which no other changes reach our production environment, so it’s safe to assume that the vast majority of reported issues are related to our upgrade. You can probably guess the next step: using Raygun to analyze, itemize, and prioritize the reported issues.
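To make the prioritization step more concrete, here is a minimal sketch of one possible triage rule: keep only the error groups first seen after the release, then rank them by how many users they affect. The `ErrorGroup` structure and its fields are hypothetical and do not mirror any particular Raygun data format, so treat this as an illustration of the idea rather than a ready-made integration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ErrorGroup:
    """Hypothetical summary of one group of reported errors."""
    message: str
    first_seen: datetime
    occurrences: int
    affected_users: int


def triage(groups: list[ErrorGroup], released_at: datetime) -> list[ErrorGroup]:
    """Return groups first seen at or after the release, most impactful first."""
    new_since_release = [g for g in groups if g.first_seen >= released_at]
    # Rank by affected users, using raw occurrence count as a tie-breaker.
    return sorted(
        new_since_release,
        key=lambda g: (g.affected_users, g.occurrences),
        reverse=True,
    )
```

Because no other changes ship during the post-release window, filtering on the first-seen time is a reasonable proxy for "caused by the upgrade". Errors that existed before the release are left out of the list and can be handled through your normal process.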
Enhancing web application upgrades with Raygun
In conclusion, Raygun is an excellent tool that complements other quality assurance tools really well as it provides an additional layer of issue detection. It’s useful in multiple stages of change implementation and it’s really easy to integrate into your application. Be sure to look out for Raygun modules that are specific to your platform to make it even easier to install.
For more guidance on working with Silverstripe CMS, make sure to check out the Silverstripe provider documentation.