Recent DreamHost problems, which at one point suggested data loss, got me thinking about backing up my data. Though I was not affected by the outage, some of my friends were, and as they listed the data they would have lost for good, I realized I would be in much the same boat if DreamHost lost my web site. Here I will outline my backup plan, including scripts, so you can get up and running quickly with your own backup strategy. The information should apply reasonably well to any web host, but it is slanted toward DreamHost because that’s the host I use.
A complete backup plan will include both the files and database contents that make up your web site. I chose to use incremental file backups along with complete database backups to balance bandwidth usage and simplicity. My backup procedure consists of two commands:
ssh email@example.com 'mysqldump --host=dbhost --user=dbuser --password=your_pw --all-databases' > $HOME/backup/dreamhost_db/`date -u +%Y-%m-%dT%H:%M:%SZ`.sql
rsync -e 'ssh -ax' -avz firstname.lastname@example.org: $HOME/backup/dreamhost/ > $HOME/backup/log/`date -u +%Y-%m-%dT%H:%M:%SZ` 2>&1
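The first backs up the databases and the second backs up the files. To use them, you will need to replace the SSH login in each command, along with dbhost, dbuser, and your_pw, with the appropriate values. You will also need to create $HOME/backup/dreamhost, $HOME/backup/dreamhost_db, and $HOME/backup/log on the backup server.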
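The directories the two commands write to can be created in one step. This is a minimal sketch; the chmod line is my own suggestion rather than part of the original procedure, added because the backup will hold sensitive files:

```shell
# Create the three directories used by the backup commands.
# -p creates missing parents and does not complain if they already exist.
mkdir -p "$HOME/backup/dreamhost" "$HOME/backup/dreamhost_db" "$HOME/backup/log"

# Optional (an assumption, not from the original commands):
# keep other users on the backup machine out of the backup tree.
chmod 700 "$HOME/backup"
```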
Both commands are run from the backup server, i.e., the machine that you want to back up your files to. This can be any machine that has a POSIX shell, has enough space to hold all your data, and that you trust to be secure (your files will probably contain your database password and other sensitive information). You can add both commands as cron jobs if you want them to run automatically, or run them manually if, like me, you want to vary when the backups occur. I run the backup scripts about once a week, which I think provides a good tradeoff between bandwidth usage and the amount of data I would need to recreate if the server suddenly crashed.
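If you go the cron route, it may be easier to put both commands in a small wrapper script, partly because cron treats an unescaped % in a crontab command as a newline, which clashes with the date format strings. The following is only a sketch: the script path, the function name backup_dreamhost, and the variables SSH_LOGIN, DB_HOST, DB_USER, and DB_PASS are placeholders I made up, and it assumes the three backup directories already exist and that SSH can log in without prompting.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as $HOME/bin/dreamhost_backup.sh.
# Example crontab entry (Sundays at 03:30):
#   30 3 * * 0  $HOME/bin/dreamhost_backup.sh

backup_dreamhost() {
    # One timestamp shared by the database dump and the rsync log.
    stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    root="$HOME/backup"

    # Full database dump. Assumes the placeholder values contain no
    # spaces or shell metacharacters, since they are expanded locally
    # and passed to the remote shell as one string.
    ssh "$SSH_LOGIN" "mysqldump --host=$DB_HOST --user=$DB_USER --password=$DB_PASS --all-databases" \
        > "$root/dreamhost_db/$stamp.sql" || return 1

    # Incremental file sync. The function returns rsync's exit status,
    # which is non-zero if some files could not be transferred.
    rsync -e 'ssh -ax' -avz "$SSH_LOGIN:" "$root/dreamhost/" \
        > "$root/log/$stamp" 2>&1
}
```

To use it as a script, set the four variables near the top and add a call to backup_dreamhost as the last line.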
The first command makes a complete backup of your databases and stores it at $HOME/backup/dreamhost_db/<UTC_time_of_backup>.sql on the backup server. For me, this is about 10 MB, so it’s not a big deal to store a new one every week or so. You may wish to delete older backups if your databases are particularly large.
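If the dumps start to add up, one option is to keep only the newest few. The sketch below assumes every file name follows the UTC timestamp pattern above, so that sorting by name is the same as sorting by age. The function name prune_dumps and its two arguments are made up for illustration.

```shell
# prune_dumps DIR KEEP: delete all but the KEEP newest *.sql files in DIR.
prune_dumps() {
    dir=$1
    keep=$2

    # Count the existing dumps (0 if there are none).
    total=$(ls -1 "$dir"/*.sql 2>/dev/null | wc -l)
    excess=$((total - keep))
    [ "$excess" -gt 0 ] || return 0

    # Timestamped names sort chronologically, so the first $excess
    # entries of the sorted list are the oldest dumps.
    ls -1 "$dir"/*.sql | sort | head -n "$excess" | while IFS= read -r f; do
        rm -- "$f"
    done
}
```

For example, `prune_dumps "$HOME/backup/dreamhost_db" 8` would keep the eight most recent dumps.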
The second command syncs your DreamHost files with $HOME/backup/dreamhost and saves a log to $HOME/backup/log/<UTC_time_of_backup> on the backup server. The first time you run the command, it will pull all your data from the server. Subsequent invocations will only pull down the changes that occurred since the last invocation. The included -z option to rsync adds compression, which should speed up the backup process.
Note that with the options I used, rsync will not delete files in $HOME/backup/dreamhost that have been deleted from the server, which will cause your local backup to grow over time relative to the server-side files. You may wish to add --delete to the rsync command, but I would approach it with caution: a file that is accidentally deleted on the server would then also disappear from your backup at the next sync. An alternative method is to periodically transfer $HOME/backup/dreamhost to a recordable DVD or other medium and then empty the directory and get a fresh copy of the data from the DreamHost server using the same rsync command.
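The archive-and-refresh alternative could look roughly like this. It is a sketch under my own assumptions: it packs the directory into a compressed tar file (which you could then burn to disc or copy elsewhere) rather than writing to a DVD directly, and the function name archive_and_reset and the archive location are hypothetical.

```shell
# archive_and_reset SRC OUT_DIR: pack SRC into OUT_DIR, then empty SRC.
archive_and_reset() {
    src=$1        # e.g. "$HOME/backup/dreamhost"
    out_dir=$2    # hypothetical place to keep the archives

    # Colon-free timestamp: GNU tar can interpret a colon in an
    # archive name as a remote host:file specification.
    stamp=$(date -u +%Y%m%dT%H%M%SZ)
    archive="$out_dir/dreamhost-$stamp.tar.gz"

    mkdir -p "$out_dir" || return 1

    # Archive first; the sync directory is only emptied if tar succeeded.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    rm -rf -- "$src" && mkdir -p "$src" || return 1

    # Print the archive path so the caller can move or burn it.
    printf '%s\n' "$archive"
}
```

After calling it, the next run of the rsync command pulls a fresh, complete copy from the server.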
Also note that some files in your home folder on DreamHost will not be readable by your user (such as $HOME/logs/example.com/http/analog/example.com.YYYY-MM.cache), so rsync will complain. This will all be caught in the log file in lines beginning with “rsync: ”. I suggest reviewing these lines to make sure that only the expected files are failing to transfer. Reviewing the rsync errors will also make you aware of other problems that may have occurred during the transfer.
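To make that review quicker, a log can be filtered down to the rsync messages that are not already known to be harmless. A small sketch, assuming the only expected failures are the analog cache files mentioned above; the function name and the exclusion pattern are my own and will likely need adjusting for your account:

```shell
# unexpected_rsync_errors LOG: print rsync's own messages from LOG,
# leaving out the unreadable analog cache files that are expected to fail.
# Exits non-zero when nothing unexpected is found (grep's usual behaviour).
unexpected_rsync_errors() {
    log=$1
    grep '^rsync: ' "$log" | grep -v '/http/analog/.*\.cache'
}
```

Calling it with the path of a log under $HOME/backup/log prints only the lines that deserve a closer look.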
If you have set up passphrase-less public-key authentication with the DreamHost servers, you can run these scripts in the background and add them as cron jobs. If not, you will have to enter your password each time you run them.
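For reference, setting up such a key from the backup server might look like the sketch below. It assumes a reasonably recent OpenSSH on both ends and that ssh-copy-id is installed; the function name setup_backup_key is hypothetical. It uses the default key path so that ssh and rsync pick the key up without extra options, and it leaves an existing key at that path untouched.

```shell
# setup_backup_key LOGIN: create a passphrase-less key (if none exists)
# and install its public half on the server.
setup_backup_key() {
    login=$1                        # your DreamHost SSH login
    key="$HOME/.ssh/id_ed25519"     # default path that ssh tries automatically

    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh" || return 1

    # -N '' gives the new key an empty passphrase.
    # If a key already exists here, it is reused as-is.
    [ -f "$key" ] || ssh-keygen -t ed25519 -N '' -f "$key" || return 1

    # Appends the public key to authorized_keys on the server;
    # this asks for your password one last time.
    ssh-copy-id -i "$key.pub" "$login"
}
```

Because anyone who can read the private key can log in to your account, this only makes sense on a backup machine you trust.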
I hope you find these commands and tips helpful. If you have any questions, please ask them in a comment.