I run a number of MySQL databases, the number of mysqld installations that I run is something like 8, but I may have forgotten some. With the number of servers that I run on a “do nothing except when it breaks” basis it’s difficult to remember the details. The number of actual databases that I run would be something like 30, four databases running on a database server (not counting “mysql” is fairly common. Now I need to maintain some sort of automated backup of these, this fact became obvious to me a couple of days ago when I found myself trying to recreate blog entries and comments from Google’s cache…
There are two types of database that I run. There are ones of significant size (more than 1GB) and tiny ones – I don’t think I run any database which has a MySQL dump file that is more than 20M and less than 2G in size.
For the machines with small databases I have the following script run daily from a cron job (with db1 etc replaced with real database names and /mysql-backup replaced by something more appropriate). The “--skip-extended-insert” option allows the possibility of running diff on the dump files but at the cost of increasing file size, when the raw file size is less than 20M this overhead doesn’t matter – and gzip should handle some extra redundancy well.
for n in db1 etc ; do
/usr/bin/mysqldump --skip-extended-insert $n | gzip -9 > /mysql-backup/$n-`date +%Y-%m-%d`.gz
Then I have a backup server running the following script from a cron job to copy all the dump files off the machines.
for n in server1 server2 ; do
scp $n:/mysql-backup/*-`date +%Y-%m-%d`.gz $n
This script relies on being run after the script that generates the dump files. Which is a little more tricky than it should be, it’s a pity that cron jobs can’t be set to have UTC run times. I could have run the dumps more frequently and used rsync to transfer the data, but it seems that the risk of losing one day’s worth of data is acceptable. For my blog I can get any posts that I might lose from Planet installations in that time period.
For the bigger databases my backup method starts by putting the database and the binary log files on the same filesystem – not /var. This requires some minor hackery of the MySQL configuration. Then I use rsync to copy the contents of an LVM snapshot of the block device. The risks of data consistency problems involved in doing this should be no greater than the risks from an unexpected power fluctuation – and the system should be able to recover from that without any problems.
My experience with MySQL dumps is that they take too long and too much system resources for large databases so I only use them for backing up small databases (where a dump can be completed in a matter of seconds so even without using a transaction it doesn’t hurt).