Backups
No need to go on about the importance of backing up your files; plenty of people have done that already, and plenty more will. Here are a few things to help you out when backing up your machine.
Using tar and ssh for a Network Backup
Tar is a very handy tool for grouping files together into a single bundle. Despite its roots as a tape archiving tool, it is almost ubiquitously used for archiving files on machines without a tape drive.
Tar can be combined with ssh to back up files across a network. The commands below show a few ways of doing this. The first command creates a tar file of the directory dir on the remote machine bkMachine. The username bob is provided for the ssh login, and the path and filename of the resulting remote tar file are specified by the of switch of the dd command.
> tar -cf - dir | ssh bob@bkMachine "dd of=/some/where/dir.tar"
The second command, instead of just creating a tar file, recreates the directory at the remote end.
> tar -cf - dir | ssh bob@bkMachine "tar -C /some/where -xf -"
It is also possible to compress the data sent over the network. The following command uses the -z flag of the tar command to enable compression. Note that -z has to be included in both the creating tar command and the extracting tar command.
> tar -zcf - dir | ssh bob@bkMachine "tar -C /some/where -zxf -"
Faster compression and data transfer are possible by using different compression tools and settings. This article on comparing compression tools gives various ways of speeding up network backups and is worth a read. WARNING: lzop seems to have problems with large files.
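As a rough sketch of that idea, the same transfer can be run through an external compressor instead of tar's -z flag. Here gzip is used at its fastest setting (-1), reusing the dir and bob@bkMachine placeholders from above:
> tar -cf - dir | gzip -1 | ssh bob@bkMachine "gunzip | tar -C /some/where -xf -"
Swapping gzip for a faster tool such as lzop follows the same pattern, provided the tool is installed at both ends.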
Using rsync for a Network Backup
rsync is an exceptionally handy tool for backing up files over a network. Its project web page gives a full description of what it can be used for. Below is just one way in which it can be used.
The following command will copy the directory dir and its contents to the remote machine bkmachine. Thus, once finished, the remote machine would have the directory /some/where/dir/ whose contents would match that of the local machine. Note that trailing slashes affect what rsync does: if dir had a trailing slash, then only the contents of the directory would be copied (an example of the difference is shown after the command below).
> rsync -va --progress --stats dir bob@bkmachine:/some/where/
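For comparison, here is a sketch of the trailing-slash variant using the same placeholders; with the slash, the contents of dir end up directly under /some/where/ rather than in /some/where/dir/:
> rsync -va --progress --stats dir/ bob@bkmachine:/some/where/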
The table below gives a brief description of the command line flags. Take a look at the man page for more details.
| Switch | Description |
| --- | --- |
| -v | Verbose output |
| -a | Archive; an abbreviation for a bunch of switches that recurse and preserve as much as possible |
| --progress | Show file transfer progress |
| --stats | Show statistics after the command has finished |
Splitting and recombining tar files
If you've created a backup in the form of a large tar file, you can use the split command to divide it into smaller chunks that will fit onto a CD or DVD. You can then use the cat command later to recombine the pieces into a tar file.
The example below splits a large tar file into a number of chunks for writing onto a DVD. The chunks will be called backup.tar.chunkXX, where XX identifies each chunk using letters: the first chunk is called backup.tar.chunkaa, the second backup.tar.chunkab, and so on. The common prefix and alphabetically ordered suffixes make reconstruction easy, as you'll see next.
> split -b 500m backup.tar backup.tar.chunk
In fact, it's easy to create the tar chunks without having to create a complete tar file first. You can do this by using a pipe, as shown below: tar writes the archive stream to stdout, which split reads to create the chunks. This is handy if you have limited disk space.
> tar -cf - /home/me | split -b 700m - backup.tar.chunk
Reconstruction can be done with the cat command. Below is an example.
> cat backup.tar.chunk* > backup.tar
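If the full tar file is never needed on disk, the chunks can also be piped straight into tar and extracted in one go; a sketch:
> cat backup.tar.chunk* | tar -xf -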
Incremental Backup Script
The script below can be used to perform incremental backups on a list of files and directories. A text file, defined by the SRC_FILE variable (in this case called srcFile), is used to declare which directories and files should be backed up. These files should exist within the directory specified in SRC_DIR. BACKUP_DIR is used to specify where the backups should be placed.
~/bin/backup
#!/bin/bash

# Source and destination (no trailing slash needed)
BACKUP_DIR="/files/backups"
SRC_DIR="/home/andy"
SRC_FILE="$BACKUP_DIR/srcFile"
LOG_FILE="$BACKUP_DIR/backup.log"
LOCK_FILE="$BACKUP_DIR/backup.lock"

# Test that required commands are in the path
type date rsync &> /dev/null
if [ $? != "0" ]; then
    echo "Required commands (date, rsync) not found" >> $LOG_FILE
    exit 1
fi

RUNDATE=$(date +%Y-%m-%d\(%T\))
echo "$RUNDATE: Started Backup" >> $LOG_FILE

if [ ! -e $SRC_FILE ] ; then
    echo "$RUNDATE: The source list file '$SRC_FILE' does not exist" >> $LOG_FILE
    exit 1
fi

if [ ! -d $BACKUP_DIR ] ; then
    echo "$RUNDATE: The backup directory '$BACKUP_DIR' does not exist" >> $LOG_FILE
    exit 1
fi

# The set -C (noclobber) prevents existing files being overwritten by redirection
# The : > basically touches a file (and zero-lengths it)
# If the file does not exist, it will be created
(set -C; : > $LOCK_FILE) 2> /dev/null
if [ $? != "0" ]; then
    # A lock file exists so exit with an error
    echo "$RUNDATE: Lock File exists, check: $LOCK_FILE" >> $LOG_FILE
    exit 1
fi

# Add a trap so that the lock file is removed on exit or a ctrl-C
trap 'rm $LOCK_FILE' EXIT

# -----------------------------------------------------------------------------
# Do the incremental backup
# -----------------------------------------------------------------------------

# Create the backup paths
BK_DIR=$BACKUP_DIR/$(date +%Y-%m-%d\(%T\))
BK_NEW=$BACKUP_DIR/bkNew
BK_OLD=$BACKUP_DIR/bkOld

# If bkOld exists, delete the target directory and the bkOld link
if [ -e $BK_OLD ] ; then
    LS_OUT=$(ls -l $BK_OLD)
    BK_OLD_DIR=${LS_OUT#*-> }   # BK_OLD_DIR is the target of the BK_OLD link
    rm -rf $BK_OLD_DIR          # Remove old backup directory
    rm $BK_OLD                  # Remove the bkOld link
fi

# If bkNew exists, change link from bkNew to bkOld
if [ -e $BK_NEW ] ; then
    LS_OUT=$(ls -l $BK_NEW)
    BK_NEW_DIR=${LS_OUT#*-> }   # BK_NEW_DIR is the target of the BK_NEW link
    ln -s $BK_NEW_DIR $BK_OLD   # Link bkOld to the current new directory
    rm $BK_NEW                  # Remove the bkNew link
    cp -al $BK_NEW_DIR $BK_DIR  # Copy (archive and link) data into the new directory
fi

# Now do the actual backup
rsync -ar --delete --files-from=$SRC_FILE $SRC_DIR $BK_DIR/
RSYNC_CODE=$?

# Check rsync success
if [ $RSYNC_CODE != "0" ]; then
    echo "$RUNDATE: Rsync returned a failure code: $RSYNC_CODE" >> $LOG_FILE
    exit 1
fi

# Link bkNew to the new backup directory
ln -s $BK_DIR $BK_NEW

echo "$RUNDATE: Backup Finished" >> $LOG_FILE

# All done
exit 0
Below is an example srcFile, the file used by rsync to determine which files and directories need backing up.
srcFile
bin
code
documents
projects
.vimrc
.bashrc
.bash_profile
.inputrc
.dir_colors
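Before handing the script over to cron, it is worth running it once by hand and checking the log. A minimal sketch, assuming the script is saved as ~/bin/backup and uses the paths shown above:
> chmod +x ~/bin/backup
> ~/bin/backup
> tail /files/backups/backup.log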
To make sure that the script does get run, it's worth setting up cron to run it at regular intervals. The crontab file below will perform the backup once every half an hour. Note the use of the nice command to reduce the script's impact on the system.
~/.crontab
# Min   Hours   Days   Months   Day   Command
0,30    *       *      *        *     nice -n 19 /home/andy/bin/backup
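To activate the schedule, the file has to be loaded into cron. A sketch, assuming the entry is kept in ~/.crontab as above; crontab -l then confirms what is installed:
> crontab ~/.crontab
> crontab -l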