Next, use rsync to create a 'time machine'

Story: Mirror Your Web Site With rsync
Total Replies: 4
gemlog

May 21, 2006
6:40 PM EDT
Mike Rubel had a great insight on how to combine rsync with hard links a few years ago:

http://www.mikerubel.org/computers/rsync_snapshots/

I've been using this idea in my own cron jobs for clients ever since. It hasn't cost me more than about 20% in additional space for hourly backups, and it's easy to extend the concept to daily, weekly, monthly or even yearly backups. It's user-accessible and fool-proof, thanks to permissions and a read-only file system.
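For anyone who won't click through, the core of Mike's trick is roughly this (a bare sketch with made-up paths; his scripts handle the rotation depth, locking and read-only remounting properly):

# Rotate the existing snapshots (the oldest one falls off the end)
rm -rf /backup/snap.2
mv /backup/snap.1 /backup/snap.2
# Hard-link copy of the newest snapshot -- costs almost no space
cp -al /backup/snap.0 /backup/snap.1
# Sync into snap.0: only changed files get rewritten; unchanged files
# remain hard links shared with the older snapshots
rsync -a --delete /home/ /backup/snap.0/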

I run it over ssh (using keys, à la the article) to the Linux boxen, and over Samba for the Windows ones.
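The ssh side is just ordinary key-based rsync, something like this (host, key and paths are placeholders):

# Pull from a client over ssh with a dedicated, passphrase-less key
rsync -az --delete -e "ssh -i /root/.ssh/backup_key" \
    backup@client.example.com:/var/www/ /backup/snap.0/var/www/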

I believe Mike has over a TB of data to backup for his department using this mechanism.

It's saved my own bacon a couple of times. N.B. depending on how busy your database is, you probably won't get a consistent snapshot of it. I do a dumpall just as often and back *that* up as well.
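For a postgres box that dump is just something like this in cron (substitute mysqldump --all-databases or whatever your database ships with; the path is a placeholder):

# Dump every database to a file the next snapshot will pick up
su - postgres -c pg_dumpall | gzip > /var/backups/pgsql/dumpall.sql.gz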

The only significant change I made was to add human-readable symlinks for the date and time, so users (especially on Windows) don't have to look at timestamps on e.g. snap.0 or snap.23 to find the directory they want; my folders are named e.g. "2006-05-20_2313-Saturday" as appropriate.
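At its simplest that's one extra line per snapshot, something like this (the by-date path is made up; the full script I use is further down in this thread):

# Give the newest snapshot a name people can actually read
ln -s /backup/snap.0 "/backup/by-date/$(date +%Y-%m-%d_%H%M-%A)"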

I think its greatest strengths are: 1) it's automated, 2) users can restore their own data, and 3) it's a reliable, secure way to get data off the local hard drive(s) and/or off site.

Hope this helps someone.
grouch

May 21, 2006
6:57 PM EDT
gemlog:

Very interesting reading. I've never used hard links like _that_ before. Thank you!
gemlog

May 21, 2006
7:10 PM EDT
grouch: Mike's is a very clever abuse of hard links. I wish more people knew about it, but it's hard to get them to even read the original article.

Do use the --link-dest option for rsync. It's much faster: my own boxes remote-rsync a little over 5 GB each hour at 12 minutes past the hour, and they all have timestamps of xx:13 when they're done.
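With --link-dest you drop the cp -al pass entirely and let rsync hard-link unchanged files against the previous snapshot itself, along these lines (paths and host are just placeholders):

# Rotate, then sync against the last snapshot; unchanged files become
# hard links into snap.1 instead of full copies
rm -rf /backup/snap.2
mv /backup/snap.1 /backup/snap.2
mv /backup/snap.0 /backup/snap.1
rsync -a --delete --link-dest=/backup/snap.1/ \
    backup@client.example.com:/home/ /backup/snap.0/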

I only have one client with that much data (> 1 TB), and I don't need to back it up, but I use it daily (and weekly, monthly...) on at least half a dozen sites and it works great.

Between this solution, lvm snapshots and raid, there's no excuse for any of us to sleep lightly anymore :-)

grouch

May 21, 2006
7:43 PM EDT
gemlog:

I just finished reading the whole thing. That's a very well-done document, complete with contributions, references, a FAQ and the bash scripts he used. His solution is better than the full rsync'ing I've been doing on my home LAN.

As you said, "Between this solution, lvm snapshots and raid, there's no excuse for any of us to sleep lightly anymore :-)"
gemlog

May 21, 2006
8:02 PM EDT
grouch: It's great that you see the value in it. I'm glad I posted. I changed some vars etc. to suit me, as well as the excludes (huge files that don't need backing up), but the thing I think helps Windows users the most is mapping, say, B: 'Hourly Backups' (or daily, weekly, what have you) to the directory of symlinks this produces:

# rm old links
OLDLINKS=$(ls $BAKDIR)
for item in $OLDLINKS; do
    $RM $BAKDIR/$item
done

## crx new symlinks with dts names:
dirlist=$($LS -d --full-time --time-style="+Q%Y-%m-%d_%H%M-%AQZ" $SNAPSHOT_RW/*)

cmd=''

for item in $dirlist; do
    dts=$($ECHO -n $item | $CUT -s -d Q -f 2)
    dir=$($ECHO -n $item | $CUT -s -d . -f 1-)

    if [ $dir ]; then
        lndir=$($BASENAME $dir)
        cmd="$LN -s /mnt/hda8/snapshotRemote/hourly/$lndir $cmd"
        cmd=$($ECHO -n $cmd | $CUT -s -d ' ' -f 1-4)
        eval $cmd
        cmd=''
    fi

    if [ $dts ]; then
        cmd="$cmd $BAKDIR/$dts"
    fi
done

I'm not much of a hand at bash, but it works and produces a listing users understand. Also, the ISO dates mean that it sorts properly by default.

If you improve it or have a better idea, please post me a copy.
