BU Filesystem backup Features





Simple and convenient to use

Bu is very simple and convenient in everyday use. Just type

bu

with no arguments, and every file and directory pattern listed in the Include file will be incrementally backed up to the backup filesystem. Although there are a number of command line options, bu is highly configurable through the rc file and/or environment variables, so you should rarely have to specify any options.

Backups are always screened by the Exclude list automatically.

To back up specific files or directories and ignore the Include list, just type

bu file file ...

This will only back up files that are newer than the ones on the backup filesystem unless you turn off incremental mode.
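
The incremental test described above can be pictured as a simple timestamp comparison. The following is only a minimal sketch of that idea in Python, not bu's actual code; the function name needs_backup and the choice to compare modification times are assumptions made for illustration.

```python
import os


def needs_backup(source, backup_copy):
    """Return True if `source` should be copied in incremental mode.

    Hypothetical helper: a file qualifies when it has no copy on the
    backup filesystem yet, or when it is newer than that copy.
    """
    if not os.path.exists(backup_copy):
        return True  # never backed up before
    return os.path.getmtime(source) > os.path.getmtime(backup_copy)
```

Turning off incremental mode would correspond to skipping this check and copying every file.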

 

Why not just do recursive copies or use cpio, tar, or pax?

In addition to the log files, include/exclude filtering, very simple command line, and other convenient features, bu has quite a few sanity checks and gracefully handles symbolic links. There is much more to this than simply using ``cp -a'' or ``cp -dpR''.

For example:

If you say

bu /a/b/c/d

and b is a symbolic link to the directory foo (that is, /a/foo) and d is a symbolic link to bar (that is, /a/foo/c/bar), which is a directory, then bu will expand every link in the specified path and back up the contents of the directory /a/foo/c/bar to /backup/a/foo/c/bar. Any symbolic links under the directory you specified (i.e. /a/foo/c/bar) will not be followed. This way the backup filesystem always stays identical to the original, whether you specify files, directories, symbolic links to files, or symbolic links to directories to be backed up.
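
To make the link handling concrete, here is a rough Python sketch of the same idea: resolve every link in the path you were given, mirror the resolved path under the backup root, and do not follow links found below it. This is an illustration under assumptions, not bu's implementation; the names backup_destination and entries_below and the default /backup root are hypothetical.

```python
import os


def backup_destination(path, backup_root="/backup"):
    """Expand every symlink in `path` and map the result under `backup_root`.

    Returns (real_path, destination). Assumes Unix-style absolute paths.
    """
    real = os.path.realpath(path)
    dest = os.path.join(backup_root, real.lstrip(os.sep))
    return real, dest


def entries_below(real_dir):
    """List everything under `real_dir` without descending into symlinks.

    Symbolic links are reported as entries of their own, but the
    directories they point to are never traversed.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(real_dir, followlinks=False):
        for name in dirnames + filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Because the destination is derived from the fully resolved path, the same directory ends up in the same place on the backup filesystem no matter which combination of links you used to name it.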

By contrast,

cp -R /a/b/c/d /backup/a/b/c/

or

cp -a /a/b/c/d /backup/a/b/c/

would just copy the link /a/b/c/d to /backup/a/b/c/d. Likewise,

cp -r foo /backup

where foo is a symbolic link to a directory will create /backup/foo as an actual directory and then traverse any symbolic links under it, which is also not what you want.

In other words, with bu you can spontaneously back up any file or directory (specified as the actual directory or as any combination of links to the directory) without being concerned about paths or having to specify the destination, and trust that the backup filesystem will be an image just like the original. Of course, your backup filesystem will be free of files in your Exclude list, such as browser cache files, tmp files, core files, etc.

To dump the default backup filesystem to CDRWs, just type

bu --dump

When dumping a backup image filesystem to CDRW, the Exclude filter is not used since it is assumed the files have already been filtered through bu. Turn off the bu_backup_image setting to dump local filesystems. This will cause bu to use the Exclude filter and retain the full path of each file. Then, for example, to dump your /home and /opt filesystems, type

bu --dump /home /opt

 

Flexible and forgiving command line

Bu's command line options can be specified in any order. Switches can be specified at the beginning of the command line, in between filename arguments, and/or at the end of the command line.

Both short and long versions of most long-form switches will work, and so will some common typographical errors. For example, --label will still work if specified as --lable.

A few examples:

 --directory        can be specified as  --dir
 --exclude_file     can be specified as  --exclude  or even  --excl
 --one_file_system  can be specified as  --one

If it seems intuitive, it will probably work. The goal is to make command line switches easy to remember without having to get them exactly right. I do not like rigid command lines. What is easy to remember or intuitive for one person is not always so for another.
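
One way to get this kind of forgiving behavior is to accept any unambiguous prefix of a switch and fall back to a closest-match search for misspellings. The Python sketch below shows that approach; it is an assumption about how such matching could be done, not a description of how bu does it (bu may simply keep a list of accepted spellings). The function resolve_switch, the option list, and the 0.8 similarity cutoff are all hypothetical.

```python
import difflib

# Switches mentioned in this document; the real list is longer.
LONG_OPTIONS = [
    "--directory",
    "--exclude_file",
    "--one_file_system",
    "--label",
    "--dump",
    "--help",
]


def resolve_switch(given, options=LONG_OPTIONS):
    """Map what the user typed to a known switch, or return None."""
    if given in options:
        return given
    # Accept any prefix that identifies exactly one switch.
    prefix_matches = [opt for opt in options if opt.startswith(given)]
    if len(prefix_matches) == 1:
        return prefix_matches[0]
    if len(prefix_matches) > 1:
        return None  # ambiguous, e.g. "--d"
    # Otherwise tolerate small typos such as "--lable".
    close = difflib.get_close_matches(given, options, n=1, cutoff=0.8)
    return close[0] if close else None
```

With this sketch, resolve_switch("--excl") and resolve_switch("--exclude") both give "--exclude_file", while an ambiguous prefix such as "--d" is rejected rather than guessed.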

 

Generates detailed log files

Bu keeps detailed logs of all backups in the directory

  /var/log/bu

Of course, the log file names and locations are configurable. The logs include such information as host name, backup mode, start and end time, specified files and directories, and a list of what files got backed up.

It also stores CDRW dump date information for incremental and differential dumps in

  /var/log/bu/dumpdates

The dumpdates file is automatically generated and updated by bu when doing CDRW dumps but it can also be edited with any text editor if needed.

 

Highly configurable

Bu is highly configurable through three configuration files.

  ~/.burc                    Runtime configuration
  $prefix/etc/bu/Include     Default list of file or directory patterns to
                             include in the backup if not specified on
                             the command line.
  $prefix/etc/bu/Exclude     List of file or directory patterns to exclude
                             from the backup.

The locations and names of the Include and Exclude files are also configurable in the rc file. Alternate rc files can also be specified on the command line or in the environment.

Most of the settings in the rc file are also configurable as environment variables.

 

Multi-user

Bu is fully multi-user. It creates unique tmp and log files for each session so that multiple backups may be run on different filesystems and/or CDRW dumps to multiple CDRW drives all simultaneously. This has several advantages.

  • Multiple users and/or multiple root administrators may run bu simultaneously on different filesystems, or even on the same filesystem without interfering with each other or having to coordinate with each other.

     

  • If a larger backup is already running and you are not sure if the latest changes to specific files that you want backed up immediately have already been backed up, you can run bu on those files without interfering with the existing backup process.

     

  • If you have more than one CDRW drive, you can dump multiple filesystems to different CDs at the same time as long as your system has the processing power to handle it. CDRW writes can fail or end up corrupted under heavy system load because CDs require a continuous write stream. However, bu tries to minimize the chance of that happening by running the CD write process at a higher priority.

     

 

Ability to limit network load

If you need to do a large backup and do not want to load the network down for your users, bu can slow down the backup and reduce bandwidth consumption by copying groups of files and sleeping in between. Using two variables in the rc file, you can tune how many files to copy at a time and how long to sleep between each group of files.
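
The copy-a-group-then-sleep idea looks roughly like this in Python. It is only a sketch of the technique: the parameter names group_size and pause_seconds are stand-ins for the two rc file variables, whose real names are not given here, and the copy and sleep arguments exist only so the loop can be tried out without touching real files.

```python
import shutil
import time


def throttled_copy(pairs, group_size=50, pause_seconds=2.0,
                   copy=shutil.copy2, sleep=time.sleep):
    """Copy (source, destination) pairs in groups, pausing between groups.

    Returns the number of files copied. No pause is taken after the
    final group, since there is nothing left to slow down.
    """
    pairs = list(pairs)
    copied = 0
    for index, (src, dst) in enumerate(pairs, start=1):
        copy(src, dst)
        copied += 1
        if index % group_size == 0 and index < len(pairs):
            sleep(pause_seconds)
    return copied
```

A smaller group size or a longer pause lowers the average bandwidth used, at the cost of a longer backup.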

 

Dynamic help

The --help switch not only shows all command line options, but also shows most of the default settings based on the current settings in ~/.burc and in the environment.

 

Restoring files

For filesystem backups the cp command is all you need. I may add restoration features in the future as well.

For CDRW dumps, I plan to add sophisticated interactive restoration features, but for now, files must be restored manually. This is easy to do using standard Unix commands. See CDRW Data Format for details.

 


BU CDRW Dump Features

 

Data format

Bu uses a standard ISO9660 filesystem with Rock Ridge extensions, containing a compressed tar file and several information files. See CDRW Data Format for the details.

 

Multi-volume

If the data will not fit on one volume, bu will automatically use multiple volumes. The volume size is configurable in the rc file. The default size is 650 MB for a standard CD.

 

Log files that can be used as CD jewel case labels

Bu creates a unique log directory for each removable media dump under /var/log/bu named dump.<date>[-x]. For example, if two dumps are done on 5/8/04, there will be two directories, dump.050804 and dump.050804-1. Each directory contains a master file list for all volumes and an info file for each volume. The info files are the same as described under CDRW Data Format except each one shows the total number of volumes as well as the volume number. These info files are designed to fit nicely in a CD jewel case for printing and using as backup labels.
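
The naming scheme can be sketched in a few lines of Python. This is an illustration only: the MMDDYY date format is inferred from the dump.050804 example above, the assumption that a third dump on the same day would get the suffix -2 is a guess, and the function name next_dump_dir is hypothetical.

```python
import datetime
import os


def next_dump_dir(log_root, day):
    """Return the first unused directory name of the form dump.<date>[-x].

    `day` is a datetime.date. The first dump of the day gets no suffix;
    later ones are assumed to get -1, -2, and so on.
    """
    base = "dump." + day.strftime("%m%d%y")
    candidate = base
    suffix = 0
    while os.path.exists(os.path.join(log_root, candidate)):
        suffix += 1
        candidate = f"{base}-{suffix}"
    return candidate
```

For example, next_dump_dir("/var/log/bu", datetime.date(2004, 5, 8)) gives dump.050804 when no dump has been logged for that day yet.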

Also, this information is logged on the hard drive so that dump information can be queried quickly to find details about backed up files and what volume they are on, without having to mount the CDs and search them. This information will be utilized later by bu when I add interactive file restoration features.

 

Good performance

Bu tries to provide as much performance as possible when doing a dump to CD. For example, it does not wait for the drive to become ready as long as there is other work to be done. If you put a CD in and start bu without closing the CD tray, you will notice that bu closes the tray and starts processing files immediately, before the tray has even finished closing.

It also does whatever it can simultaneously, such as generating the volume file list and blanking the CD if blank mode is on. You will find it takes no longer, most of the time, with blank mode on than with it off.

 

Fault tolerant

Bu is designed to be as fault tolerant as possible.

When running as root, it increases its process priority while actually writing to the CD to decrease the chance of getting a write failure or corrupt CD under heavy system load.

Also, bu traps INT and TERM signals, cleans up its temp files, and exits gracefully, leaving a note in the error log that it was aborted.

If there is an error writing or blanking a CD, bu will prompt to re-write the same volume rather than aborting and making you re-do the entire dump. This is really handy when you have been dumping for hours and on volume number 9 you got distracted and forgot to put another blank CD in the drive before hitting <enter>.

 

Ability to do incremental, differential, and full dumps

Incremental mode dumps all files that have changed since the last dump of any mode.

Differential mode dumps all files that have changed since the last full dump.

Full mode dumps all files.

Bu is configured to use Differential out of the box. This can be changed in the rc file.
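
The difference between the three modes comes down to which date a file has to be newer than. Here is a minimal Python sketch of that rule; it assumes the two dates (last full dump, last dump of any mode) are available as timestamps, for example from the dumpdates file, and the function names are hypothetical. Treating a missing date as "dump everything" is also an assumption.

```python
def dump_cutoff(mode, last_full, last_any):
    """Return the time a file must be newer than, or None for no limit."""
    if mode == "full":
        return None
    if mode == "differential":
        return last_full  # changed since the last full dump
    if mode == "incremental":
        return last_any   # changed since the last dump of any mode
    raise ValueError(f"unknown dump mode: {mode}")


def select_files(mtimes, mode, last_full, last_any):
    """Pick the files to dump from a {path: modification_time} mapping."""
    cutoff = dump_cutoff(mode, last_full, last_any)
    return sorted(path for path, mtime in mtimes.items()
                  if cutoff is None or mtime > cutoff)
```

A differential dump therefore grows until the next full dump, while an incremental dump only ever contains what changed since the previous dump.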

 

Automatic blanking

If you turn on blank mode, CDRWs that already have data on them can be used without erasing them ahead of time. Bu uses the fast blanking method for this.

 

Auto eject

When auto eject is on, bu will automatically eject the media between each volume and at the end of the backup.

 

Ability to set a text string to use as a backup label

If set, this string is added as a label field in the info file for each volume. This label can be displayed by interactive restoration tools and user interfaces.

 

Configurable write speed

The write speed defaults to 4x out of the box, which works with all but very old CDRWs. You will likely want to increase it for modern writable CDs.

 

Data compression

Compression is done with gzip and is on by default. It can be turned off in the rc file.