In the software industry, upgrades are inevitable. Like cars and homes and mobile phones, software ages and needs to be replaced, either in whole or in part. Old frameworks die and new ones are born, and if you want your application to stay alive and relevant in this world you will need to give it shiny new parts from time to time.

With package managers like pip and yarn, upgrading a dependency is often as quick and painless as adding an --upgrade flag on the command line. But every once in a while, a major version bumps up and you're in for a ride.

Python 3

At SilviaTerra, Python and R power a whole suite of biometric analysis tasks, including the creation of Basemap, our high-resolution inventory of all trees in the contiguous United States. Python 2 served us well for years, and while big names in the Python community moved steadily toward version 3, we as a small developer team had to decide when to dedicate precious resources to an upgrade. Such things are always at the expense of other forward progress.

With Python 2's End of Life date fast approaching, we knew the time was at hand.

I won't go into the gory details of how we upgraded. It's the same story of blood, sweat, and tears that's been told since the dawn of the Unix epoch. If you're a Python developer, by now you've already been through it. And if you're not, the details don't matter to you anyway.

But there are some key takeaways that hold true for mitigating risk in any project, regardless of language–core principles of software development that we all know but sometimes need to remember:

Unit Tests

Good unit tests are boring to write–that's why Test Driven Development is a thing; if you leave test-writing for later, you'll never convince yourself to do it. It's the software equivalent of eating your vegetables first.

But good unit tests are also undeniably useful, even necessary. Without them, you're flying blind. Is your function doTheThing() still going to do its thing after a dependency upgrade? I don't know. And neither do you, unless it's unit tested.

Regression Tests

Unit tests give you the knowledge that your software is robust at the lowest level. End-to-end, integration, and regression tests (which I collectively lump into "regression tests") give you confidence that you've put the blocks together in a way that will stand.

During our Python upgrade, regression tests were critical in turning up very real issues such as NumPy's int64 no longer subclassing the built-in int, and pandas.argmax changing its behavior. Some issues, like these, would have blown up and cost us time to correct.

Others, such as this performance degradation in pyproj, were more insidious. This could have ended up costing real dollars in cluster compute time, but we caught it when a regression test suddenly took much longer than usual.

Time spent making your regression tests simple and repeatable makes them ready for automation using continuous build tooling like Github Actions. Invest that time.

Scope

The Python 3 upgrade was a big project, touching over a dozen repositories with code running in varied environments: AWS EC2 and EMR, Azure VM and HDInsight, and our team's individual laptops. How were we going to make sure each of our jobs would work in each of these environments?

Containerization enabled us to focus on a much narrower scope. We built a Python 3 Docker image with everything we wanted to run, and focused on getting things working locally inside a Docker container before even attempting to run on our cloud environments. At least 90% of the upgrade issues came out in these early days, when we had a quick debug cycle and a low cost of execution.

That said, we did run every job in every environment (see my earlier praise of regression testing), but by then the test runs were pretty smooth. Containerization limits the variables you need to worry about, and limiting scope to just the problem you're trying to solve–in this case, removing our dependency on Python 2–is always a good thing.

Communication

Software doesn't exist in isolation. If you're doing it right, then someone's using it. Communicating with your users is an often-overlooked part of a project. Here's a short list of the questions we asked (and answered) before starting off:

  1. What are we doing? (Upgrading Python from version 2 to 3)
  2. Why are we doing this? (Python 2 is end-of-life and the libraries we depend on will stop supporting it)
  3. Will we need users' time during the project? (Yes! Carve out time for users to perform their own testing. They will invariably find issues you haven't thought of.)
  4. Will the user experience change when we're done? (Sometimes yes, sometimes no, just be clear about what to expect.)

In addition to the big questions above, we documented and shared the full upgrade plan, posted weekly status updates, and wrote up detailed instructions for migrating local environments as needed.

Communication doesn't end with your end users. Much of modern software architecture depends on open source projects made possible by a global community of developers.

During our upgrade we found a bug in the Azure SDK for Python. We patched it immediately in our own code, but then we made sure to report it. Give back! Report issues, offer pull requests, or otherwise be involved with the software you depend on.


Software development is an ongoing balance between the risk of introducing change and the risk of falling behind the times. But with rigorous testing, well-defined scope, and clear communication, taking on a new project doesn't have to be risky. Our Python 3 upgrade is behind us, and we continue to apply its lessons as we use the latest technology to enable better management of America's forests.