I’m tracking the backup systems for my various projects and noticed my approach is all over the place. This topic outlines my approach to each system, which I’ll later expand into implementation details.
Why create backups?
Primarily to restore a project to operational status when an issue brings the system down and leaves data missing or difficult to recover.
Yeah, that’s about it.
When one computer breaks, let’s copy all the files to a new computer and get it running again.
Systems to back up
Here are the systems I’m backing up, with notes on each.
Discourse
I have several Discourse instances, and while each one holds important communications, none of them are used for long-term storage of media. That simplifies backups on my part a bit.
When hosted at discourse.org, I see this message on the backups page:
Off-site backups for disaster recovery are created every 12 hours. To download a backup manually, see our documentation. If you encounter an emergency and need to roll back your site to an earlier version, please contact us.
Since the sites are backed up frequently and I don’t need to restore uploads if a site goes down, I’m okay with downloading a backup from each site once a month.
To do this I’ll set up a scheduled API call to trigger the manual backup, which will then send me a PM as the admin on each site.
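A minimal sketch of that trigger, assuming Discourse’s admin backup endpoint and an admin-scoped API key; the site URL and key are placeholders:

```bash
#!/usr/bin/env bash
# Sketch: trigger a manual backup through the Discourse admin API.
# SITE and API_KEY are placeholders to fill in per site.
SITE="https://forum.example.com"
API_KEY="replace-me"

curl -s -X POST "$SITE/admin/backups.json" \
  -H "Api-Key: $API_KEY" \
  -H "Api-Username: system" \
  -F "with_uploads=false"    # uploads aren't needed for these sites
```

A monthly cron entry (say, `0 4 1 * *`) would take care of the scheduling.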
Hugo
Hugo is a static site generator. It integrates with git.
So backing up these sites means committing those projects to a git repo!
My current homepage was created while my laptop was malfunctioning due to an expanding battery; that chassis is still being repaired, so I had to do some quick copying around, and I did not have git set up consistently… anyhow, the files are still backed up, just not in version control.
This is not great, but not horrible: it’s a single page with four sentences, not much is happening.
However, I recently got aider working, and I’m looking forward to doing more with Hugo, and that will definitely benefit from using git.
I will sync these repos to a code forge, though I’m still undecided which one. GitHub is very popular of course, but I use it for work, and it’s nice being able to use a separate interface for my personal projects. Since my projects are composed of open source code and public domain content (the stuff I produce), I’m fine putting them on an instance where those values are shared and important.
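For the sites not yet in version control, the setup is only a few commands; this sketch uses a hypothetical path and remote URL, since the forge is still undecided:

```bash
# Sketch: put an existing Hugo site under version control and push it
# to a forge. The path and remote URL below are placeholders.
cd ~/sites/homepage                  # hypothetical site directory
git init
git add .
git commit -m "Import existing Hugo site"
git remote add origin git@forge.example.org:me/homepage.git
git push -u origin main
```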
Piwigo
Piwigo consists of software files (which I could download again), uploaded media, and a database.
Fortunately, the instances I host contain only completely public media. That means I don’t need to protect backed-up copies of the media; there is no secret data included. The database, however, potentially has values I don’t want to share.
I need to look at an automated way of stashing backups of the database somewhere I can grab them… such as a script which dumps the db, copies it to a secure bucket, and sends me a link in chat to download. A monthly cadence.
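A rough sketch of what that script could look like, assuming a MySQL-backed Piwigo install with credentials in ~/.my.cnf, an rclone remote named "bucket", and a chat webhook; every name here is a placeholder:

```bash
#!/usr/bin/env bash
# Sketch of a monthly Piwigo database backup. Assumes MySQL credentials
# live in ~/.my.cnf, an rclone remote named "bucket" is configured, and
# CHAT_WEBHOOK is set in the environment.
set -euo pipefail

STAMP=$(date +%Y-%m-%d)
DUMP="piwigo-db-$STAMP.sql.gz"

# Dump and compress the database (database name is a placeholder).
mysqldump piwigo | gzip > "/tmp/$DUMP"

# Stash it in a private bucket.
rclone copy "/tmp/$DUMP" bucket:piwigo-db/

# Ping me in chat; the payload shape depends on the chat service.
curl -s -X POST "$CHAT_WEBHOOK" \
  -H 'Content-Type: application/json' \
  -d "{\"text\": \"Piwigo DB backup uploaded: $DUMP\"}"

rm "/tmp/$DUMP"
```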
For the uploads I’ll probably rclone them to an inexpensive bucket somewhere. Run it daily since it will only transfer changed files.
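Something like this, where the local path and remote name are placeholders:

```bash
# Sketch: sync the Piwigo upload directory to a bucket; rclone sync
# only transfers files that changed since the last run.
rclone sync /var/www/piwigo/upload bucket:piwigo-uploads

# A daily cron entry to run it, e.g. at 03:00:
# 0 3 * * * rclone sync /var/www/piwigo/upload bucket:piwigo-uploads
```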
WordPress
I have a single WordPress site; unless that project turns around in some fundamental way, this is the last WordPress site I’ll be personally hosting.
Since this one is being slowly archived and migrated, one more backup should cover everything; afterward the media library and database will slowly shrink until they’re empty.
I’ll put together a script with wp-cli to export and back up all the things, then download it to local backups.
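A starting point for that script, assuming it runs from the WordPress root; the output directory is a placeholder:

```bash
#!/usr/bin/env bash
# Sketch of a one-shot WordPress export with wp-cli, run from the
# WordPress root. The output directory is a placeholder.
set -euo pipefail

STAMP=$(date +%Y-%m-%d)
OUT="$HOME/backups/wordpress-$STAMP"
mkdir -p "$OUT"

# Full database dump.
wp db export "$OUT/database.sql"

# Content (posts, pages, comments) as a WXR file.
wp export --dir="$OUT"

# Media library, themes, and plugins all live under wp-content.
tar -czf "$OUT/wp-content.tar.gz" wp-content
```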