BRB... just upgrading Python
CW: nerdy, technical details.
Originally, MLTSHP (well, MLKSHK back then) was developed for Python 2. That was fine for 2010, but 15 years later, and Python 2 is now pretty ancient and unsupported. January 1st, 2020 was the official sunset for Python 2, and 5 years later, weāre still running things with it. Itās served us well, but we have to transition to Python 3.
Well, I bit the bullet and started working on that in earnest in 2023. The end of that work resulted in a working version of MLTSHP on Python 3. So, just ship it, right? Well, the upgrade process basically required upgrading all Python dependencies as well. And some (flyingcow, torndb, in particular) were never really official, public packages, so those had to be adopted into MLTSHP and upgraded as well. With all those changes, it required some special handling. Namely, setting up an additional web server that could be tested against the production database (unit tests can only go so far).
Hereās what that change comprised:Ā 148 files changed, 1923 insertions, 1725 deletions. Most of those changes were part of the first commit for this branch, made on July 9, 2023 (118 files changed).
But by the end of that July, I took a break from this task - I could tell it wasnāt something I could tackle in my spare time at that time.
Time passesā¦
Fast forward to late 2024, and I take some time to revisit the Python 3 release work. Making a production web server for the new Python 3 instance was another big update, since I wanted the Docker container OS to be on the latest LTS edition of Ubuntu. For 2023, that was 20.04, but in 2025, itās 24.04. I also wanted others to be able to test the server, which means the CDN layer would have to be updated to direct traffic to the test server (without affecting general traffic); I went with a client-side cookie that could target the Python 3 canary instance.
In addition to these upgrades, there were others to consider ā MySQL, for one. Weāve been running MySQL 5, but version 9 is out. We settled on version 8 for now, but could also upgrade to 8.4⦠8.0 is just the version you get for Ubuntu 24.04. RabbitMQ was another server component that was getting behind (3.5.7), so upgrading it to 3.12.1 (latest version for Ubuntu 24.04) seemed proper.
One more thing - our datacenter. Weāve been using Linodeās Fremont region since 2017. Itās been fine, but there are some emerging Linode features that Iāve been wanting. VPC support, for one. And object storage (basically the same as Amazonās S3, but local, so no egress cost to-from Linode servers). Both were unavailable to Fremont, so I decided to go with their Chicago region for the upgrade.
Now weāre talking⦠this is now not just a āpush a buttonā release, but a full-fleged, build everything up and tear everything down kind of release that might actually have some downtime (while trying to keep it short)!
I built a release plan document and worked through it. The key to the smooth upgrade I want was to make the cutover as seamless as possible. Picture it: once everything is set up for the new service in Chicago - new database host, new web servers and all, what do we need to do to make the switch almost instant? Itās Fastly, our CDN service.
All traffic to our service runs through Fastly. A request to the site comes in, Fastly routes it to the appropriate host, which in turns speaks to the appropriate database. So, to transition from one datacenter to the other, we need to basically change the hosts Fastly speaks to. Those hosts will already be set to talk to the new database. But thatās a key wrinkle - the new databaseā¦
The new database needs the data from the old database. And to make for a seamless transition, it needs to be up to the second in step with the old database. To do that, we have take a copy of the production data and get it up and running on the new database. Then, we need to have some process that will copy any new data to it since the last sync. This sounded a lot like replication to me, but the more I looked at doing it that way, I wasnāt confident I could set that up without bringing the production server down. Thatās because any replica needs to start in a synchronized state. You canāt really achieve that with a live database. So, instead, I created my own sync process that would copy new data on a periodic basis as it came in.
Beyond this, we need a proper replication going in the new datacenter. In case the database server goes away unexpectedly, a replica of it allows for faster recovery and some peace of mind. Logical backups can be made from the replica and stored in Linodeās object storage if something really disastrous happens (like tables getting deleted by some intruder or a bad data migration).
I wanted better monitoring, too. Weāve been using Linodeās Longview service and thatās okay and free, but it doesnāt act on anything that might be going wrong. I decided to license M/Monit for this. M/Monit is so lightweight and nice, along with Monit running on each server to keep track of each service needed to operate stuff. Monit can be given instructions on how to self-heal certain things, but also provides alerts if something needs manual attention.
And finally, Linodeās Chicago region supports a proper VPC setup, which allows for all the connectivity between our servers to be totally private to their own subnet. It also means that I was able to set up an additional small Linode instance to serve as a bastion host - a server that can be used for a secure connection to reach the other servers on the private subnet. This is a lot more secure than before⦠weāve never had a breach (at least, not to my knowledge), and this makes that even less likely going forward. Remote access via SSH is now unavailable without using the bastion server, so we donāt have to expose our servers to potential future ssh vulnerabilities.
So, to summarize: the MLTSHP Python 3 upgrade grew from a code release to a full stack upgrade, involving touching just about every layer of the backend of MLTSHP.
Hereās a before / after picture of some of the bigger software updates applied (apologies for using images for these tables, but Tumblr doesnāt do tables):
And a summary of infrastructure updates:
Iām pretty happy with how this has turned out. And I learned a lot. Iām a full-stack developer, so Iām familiar with a lot of devops concepts, but actually doing that role is newish to me. I got to learn how to set up a proper secure subnet for our set of hosts, making them more secure than before. I learned more about Fastly configuration, about WireGuard, about MySQL replication, and about deploying a large update to a live site with little to no downtime. A lot of that is due to meticulous release planning and careful execution. The secret for that is to think through each and every step - no matter how small. Document it, and consider the side effects of each. And with each step that could affect the public service, consider the rollback process, just in case itās needed.
At this time, the server migration is complete and things are running smoothly. Hopefully we wonāt need to do everything at once again, but we have a recipe if it comes to that.
















