I’ll start off with an admission: I’m a relatively clueless user of the command line in OS X. Sure, I know my way around the basics such as ls, cp, and mv, and I have a working knowledge of vi and a basic understanding of some of the more advanced programs. But that’s about it: minimal shell scripting skills, no knowledge of regular expressions, and only the most basic understanding of pipes, redirection, combining commands, etc. So I regularly find myself amazed by the power of what (for a Unix wizard) would be a trivially simple task.
Such was the case yesterday. Earlier in the day, I’d had a bit of a scare with our family blog site (like robservatory, it runs on WordPress). Due to a mix-up on the administrative end, the WordPress database for the site was deleted. Historically, I’ve been very paranoid about backing up the macosxhints sites. But for whatever reason, that same paranoia didn’t extend to my two personal sites. Hence, I had no backup to help with the problem. Thankfully, the ISP did, and the family blog was soon back online without any loss of data. But I resolved not to let this happen again without a local backup of my own.
For the hints site, I was already using cron and ssh to automate the backups, as I described way back in 2001. So it was a simple matter to take those scripts and modify them for the robservatory and family blog sites, which I quickly did. That made me feel somewhat better, but I also wanted to back up the sites’ data files. For hints, what I had been doing was to create a tar file of all the files on the site. I then used cron to create and download this file once a week. But I knew there had to be a better way.
Enter rsync, a powerful tool for creating a synchronized copy of files from one machine on another machine. Operation is basically automatic, it seems, and it can even use ssh for secure file transfers. After reading through the man pages, my basic needs seemed simple enough, so I set up a directory and gave it a shot…and, as with most things I do in Unix, I was then stunned when it worked as described! The initial run took quite a while, obviously, since my family blog site has nearly 2GB of data on it (mostly movies of our daughter). But after that, future rsync updates were amazingly fast, because rsync only transfers what has changed since the last run. I was thrilled: I now had a local copy of the exact structure of the site, kept in sync with one (relatively simple) command. Feeling confident now, I then set up rsync updates for each of my sites, leading to this wonderfully reassuring directory on my machine:
Within each of those folders is an exact duplicate of each site’s files, along with the SQL backups from my previously created script. The last step was simply to use cron to have my SQL and rsync backup scripts run each day. Presto! Daily dumps of the SQL file, along with a freshly updated, synced copy of each site’s files.
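For anyone who hasn't set up a cron job before, here is a minimal sketch of what a daily entry could look like. The script path, log file, and 02:30 run time are all hypothetical placeholders, not my actual setup; the snippet only prints the line so you can look it over before installing it.

```shell
#!/bin/sh
# Hypothetical script location and log file; adjust both to taste.
BACKUP_SCRIPT="$HOME/bin/site_backup.sh"
LOG_FILE="$HOME/site_backup.log"

# crontab fields: minute hour day-of-month month day-of-week command
# "30 2 * * *" means 02:30 every day (the time is an assumption).
CRON_LINE="30 2 * * * /bin/sh $BACKUP_SCRIPT >> $LOG_FILE 2>&1"

# Print the entry for review instead of installing it.
printf '%s\n' "$CRON_LINE"

# To install it, append the line to the current crontab (uncomment to use):
# (crontab -l 2>/dev/null; printf '%s\n' "$CRON_LINE") | crontab -
```

Sending both stdout and stderr to a log file means there is something to read the next morning if a backup quietly fails.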
If you’re curious, the actual scripts look something like this:
### robservatory database backup ###
### this bit is basically identical to the 2001 hint ###
ssh -l username www.robservatory.com ./backsql.sh
scp username@www.robservatory.com:backup_sql.tar \
ssh -l username www.robservatory.com rm backup_sql.sql
ssh -l username www.robservatory.com rm backup_sql.tar
### robservatory site files sync ###
rsync -avz username@www.robservatory.com:/var/www/html \
--exclude "weblogs/" /path/to/Site_Backups/robservatory_sync
The first part of the script runs a shell script on the host to dump the MySQL data and then compress it via tar. Here's what that script looks like:
mysqladmin -u uname -ppword flush-hosts
mysqldump --add-drop-table -h localhost \
-u uname -ppword mysql_dbase > backup_sql.sql
tar -czf backup_sql.tar backup_sql.sql
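One thing worth knowing about that last tar line: the -z option gzips the archive, so despite its .tar name the file is what is usually called a .tar.gz. A quick way to confirm an archive like this is readable before deleting the originals is to list it with -t. The sketch below does that locally with a made-up placeholder dump in a temporary directory; it is not part of my actual script.

```shell
#!/bin/sh
# Stand-in for the real dump: the file name matches the script above,
# but the contents are a placeholder for this demonstration.
work_dir=$(mktemp -d)
echo "-- placeholder SQL dump" > "$work_dir/backup_sql.sql"

# Same options as the backup script: -c create, -z gzip, -f archive name.
# -C switches into work_dir so the archive stores a bare file name.
tar -czf "$work_dir/backup_sql.tar" -C "$work_dir" backup_sql.sql

# -t lists the archive without extracting it; a non-zero exit status
# means the archive is damaged or unreadable.
if tar -tzf "$work_dir/backup_sql.tar" > /dev/null; then
    echo "archive OK"
else
    echo "archive unreadable" >&2
fi
```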
The second bit is the sync step. The rsync options -avz and --exclude work like this:
- -a: archive mode, which recurses into directories and keeps symbolic links, permissions, ownerships, timestamps, etc. intact.
- -v: verbose output, to provide more feedback.
- -z: compress file data during the transfer, to help speed things up over the network.
- --exclude "weblogs/": do not back up the listed directory (it's just log files I don't really need to back up).
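If you'd like to see these options in action without touching a real server, here's a small local sketch. It builds a throwaway directory tree (all of the paths and file names are made up for the demonstration) and syncs it the same way, minus -z, since there's no network involved.

```shell
#!/bin/sh
# Throwaway source and destination trees; every path here is hypothetical.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/site/html/weblogs" "$demo_root/backup"
echo "hello" > "$demo_root/site/html/index.html"
echo "log line" > "$demo_root/site/html/weblogs/access.log"

# No trailing slash on the source, so the html directory itself is
# recreated inside the destination, as in the script above.
# The trailing slash in "weblogs/" makes the pattern match directories only.
rsync -av --exclude "weblogs/" "$demo_root/site/html" "$demo_root/backup"

# Show what arrived: index.html should be there, weblogs should not.
ls -R "$demo_root/backup"
```

Running the rsync line a second time is a nice way to see why later updates are so quick: with nothing changed, no files are listed as transferred.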
The final step in the equation, for this week at least, was to use the Energy Saver preference pane to have my Mac wake up each evening, since I’ll be at the Expo all week. I scheduled the cron tasks to run after the wakeup, and Energy Saver then puts the Mac back to sleep a few hours later, to ensure all updates are done (and to give me some time to connect remotely via ssh, if I wish).
What amazed me the most is how quickly this all came together: not counting the initial sync time, it took maybe three hours, tops, for me to go from not knowing anything about rsync to having four sites all set up and ready to go. Gads, I love OS X!