How-To: Automatically back up your computer
We've seen plenty of crazy ways to keep your precious data safe. Some people burn a few tons of DVDs, others make a monthly habit of swapping hard drives into a safe location. In today's How-To we'll show you how to automatically keep your data backed up from your computer with ssh and rsync. Feel that? That's our warm comfy safe-data blankie. Check it out.
What about backup software? There are so many flavors of software, only Google could count them all. Since we like our choice of operating systems, we want something that'll work from our Mac, Windows or Linux machines. But we'll cover some good software backup options next time -- for now it's the down and dirty, nitty gritty network backups.
First of all, we're going to need somewhere to keep our data. For the tools we'll be using, we'll need a server that we can access via secure shell (ssh) from anywhere on the net. If you only want to back up your stuff at home, that's fine.
For our first example, we'll be using Ubuntu Linux on our laptop with our Linux web server. You can get an inexpensive shared host from a web host provider or roll your own like we did. Our particular backup solution is to update a copy of our data rather than take incremental snapshots over time. You can do it either way, but for our needs, we just want our current data set kept alive.
Once you've decided where to keep your data, and what you want to back up on your laptop or workstation, you'll need the tools to keep things rolling.
The heart of our -- and many others' -- cross platform backup is a combination of ssh and rsync. The secure shell is probably the most useful networking application ever. We'll use it to transport our data securely to our backup location. To update the files, rsync will be used -- it's designed to copy and synchronize data from one location to another.
For our example, we'll back up /home/willo/data to our server. The top directory 'data' should be created on the server. We'll use an Ubuntu laptop; if they're not already installed, you can easily install rsync and ssh with this command:
sudo apt-get install rsync openssh-client
The command below will copy and update the data inside /home/willo/data to our server's directory /home/willo/data. Run it from /home/willo, since the source path is relative. When it's run by hand, it requires the ssh password for user willo on the server. Not a big deal, but when it's automated, we won't be there to enter the password.
rsync -avz -e ssh data willo@biobug.org:/home/willo
To get around the password requirement, we need to create a pair of ssh keys. The keys will allow our ssh connections without user intervention. (This also means that someone else could connect if they get your key...) Here's the command for copy/paste ease:
ssh-keygen -t rsa -b 2048 -f /home/willo/backup-key
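The command creates a pair of keys: the private key, backup-key, and the public key, backup-key.pub. (When ssh-keygen asks for a passphrase, leave it empty so the backup can run unattended.) The private key will allow us to open our connection. We'll need to copy the public key to our destination/backup server. Once it's there, we add it to our authorized keys file, ~/.ssh/authorized_keys. If you don't have one yet, just rename the public key file to authorized_keys instead of appending it.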
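Here's a rough sketch of what that copy-and-append step might look like. It assumes ssh-keygen wrote the public half to /home/willo/backup-key.pub (its usual naming), and install_backup_key is a helper name we made up for illustration, not a standard tool.

```shell
# On the laptop, send the public half of the pair to the server
# (this one time you'll still be asked for willo's password):
#   scp /home/willo/backup-key.pub willo@biobug.org:/home/willo/

# Then, logged in on the server, append the key to the authorized keys file.
# install_backup_key is a hypothetical helper name.
install_backup_key() {
    pubkey="$1"                     # e.g. /home/willo/backup-key.pub
    ssh_dir="${2:-$HOME/.ssh}"      # defaults to the current user's ~/.ssh

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"            # keep the directory private to its owner
    cat "$pubkey" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

On the server you'd call it as install_backup_key /home/willo/backup-key.pub. Appending with >> also creates the file if it isn't there yet, so it covers both cases.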
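Now we can run our backup without a password, using one simple command. The trailing slash on the source tells rsync to copy the contents of data rather than the directory itself, so everything lands directly in the server's /home/willo/data. Again, here it is for quick cut and paste:
rsync -avz -e "ssh -i /home/willo/backup-key" /home/willo/data/ willo@biobug.org:/home/willo/data/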
As always, replace the paths at will. (Har har.)
Now we'll put our backup command into a script so we don't have to remember every detail. This is the quick and dirty version. We ran vi backup-data.sh and added a bash header and our backup command. We'll save it as backup-data.sh and run chmod 500 backup-data.sh so we can execute it, but other users can't look at it.
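For reference, a quick and dirty backup-data.sh could look something like the sketch below. The variable names and the run_backup function are our own invention for illustration; the paths and host are the example values used above, with trailing slashes so the contents of data land in the server's data directory.

```shell
#!/bin/bash
# backup-data.sh: push /home/willo/data to the backup server.
# KEY, SRC and DEST hold the example values from this how-to; edit to suit.
KEY=/home/willo/backup-key
SRC=/home/willo/data/                        # trailing slash: copy the contents
DEST=willo@biobug.org:/home/willo/data/

run_backup() {
    # -a preserves permissions and timestamps, -v lists files as they go,
    # -z compresses data in transit, -e tells rsync to connect over ssh
    # using our passwordless key.
    rsync -avz -e "ssh -i $KEY" "$SRC" "$DEST"
}

# In the saved script, finish with a bare call on the last line:
#   run_backup
```

Keeping the command in a function makes it easy to try by hand first: source the file, type run_backup, and watch what rsync reports before handing it over to cron.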
Cron is a scheduling program. We can schedule software to run as often, or as rarely, as we want. To regularly run our backup program, we'll create a cron job. Use the command crontab -e to edit your crontab. The first five entries determine how often to run the job, and the command follows. In this case, we'll sync our data every 30 minutes.
Now you know how to do it from a Linux box, but how about Windows? You can do the same thing by installing cygwin - it's a set of unix tools built to run under the Windows environment. Download the installer here.
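On either platform, the every-30-minutes cron job described above could be written something like this. It's only a sketch: it assumes the script was saved as /home/willo/backup-data.sh (we never said which directory it lives in), and preview_crontab is a made-up helper name.

```shell
# The five schedule fields are: minute, hour, day of month, month, day of week.
# "0,30" in the minute field means on the hour and at half past, every hour.
CRON_LINE='0,30 * * * * /home/willo/backup-data.sh'

# Print the current crontab with the new entry added at the end, so you can
# look it over before pasting the line in with `crontab -e`.
preview_crontab() {
    crontab -l 2>/dev/null || true    # existing entries, if there are any
    printf '%s\n' "$CRON_LINE"
}
```

Running preview_crontab changes nothing; it just shows the result. The line in CRON_LINE is what actually goes into the editor that crontab -e opens.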
Run the installer and step through the process. When you get to the package selection window, you'll need to select cron, ssh and rsync.
Once the installer finishes, open up the cygwin bash shell program. From here you'll be able to perform the same steps we outlined for Linux. The only real difference is in the directory names. We suggest putting your data in an easy location like C:\data. Then you can use /cygdrive/c/data in your commands instead of the usual C:\data.
You can do the same things under Mac OS X, and all the tools are already installed. Just navigate to Applications/Utilities and open up the Terminal program. After that, you can follow the same instructions. A word of warning: rsync's support of resource forks has been an issue. You'll probably want to look into using rsyncx. If you're dealing with simple data like image files, normal rsync should get the job done.
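As a rough illustration of the path difference, the backup command from the Linux example might turn into something like this under cygwin. The key location is an assumption (it depends on where you ran ssh-keygen inside the cygwin shell), and run_windows_backup is a name we made up.

```shell
# Under cygwin, Windows drives show up beneath /cygdrive, so C:\data
# becomes /cygdrive/c/data.
WIN_SRC=/cygdrive/c/data/
WIN_KEY=/home/willo/backup-key    # assumed location of the private key

run_windows_backup() {
    # Same rsync options as the Linux example; only the source path changes.
    rsync -avz -e "ssh -i $WIN_KEY" "$WIN_SRC" willo@biobug.org:/home/willo/data/
}
```

If your cygwin home directory has spaces in its name, keep the key somewhere with a simpler path, since this sketch doesn't quote the key path for ssh.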
Now that you know how to keep your data backed up to a server, where should it go? Well, how about to an off the shelf NAS like the Buffalo Terastation? With a few modifications, we used the same solution with ours.
A visit to the Terastation wiki turned up a few hacks that opened up the box's latent abilities. We installed firmware from this page to gain telnet and root access to the box. The updater is pretty large, but it worked just fine for us.
After the update, we opened a telnet session to our Terastation. (We gave ours a static IP and set the gateway and DNS settings.) To quickly and easily install ssh, we used the following commands logged in as myroot:
cd /home
wget http://www.terastation.org/files/dropbear.tgz
cd /
tar zxvf /home/dropbear.tgz
reboot
With ssh running, we created a user using the normal control panel. All users are defaulted to home, but you can edit /etc/passwd and provide something like /home/willo if you want to keep things separate. Create /home/willo/.ssh and copy the public key to authorized_keys inside it. After that, decide where to keep your backup. Don't use the home directory -- put it under one of the normally shared directories under /mnt/array1.
If the worst should happen, or you just need a copy of your data, you can snag it using a couple of tricks. To securely copy the whole shebang, scp is the easiest. (scp, or secure copy, is built into OpenSSH.) You can use a key, but your password will work just as well.
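A restore with scp might look roughly like the sketch below. The restore directory is a made-up location and restore_all is a hypothetical helper name; the server path is the one from our backup example.

```shell
# Pull the whole backup down into a fresh directory.
# RESTORE_DIR is a made-up default; pick anywhere with enough room.
RESTORE_DIR=/home/willo/restored

restore_all() {
    dest="${1:-$RESTORE_DIR}"
    mkdir -p "$dest"
    # -r copies directories recursively. You'll be prompted for willo's
    # password; add "-i /home/willo/backup-key" after scp to use the key.
    scp -r willo@biobug.org:/home/willo/data "$dest"
}
```

Calling restore_all with no argument drops a data directory inside /home/willo/restored; pass a different path as the first argument to put it somewhere else.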
If you've got most of your data on another machine, but want to update it with the latest changes from your working copy, you can use rsync, but reverse the source and destination. Again, you can use the key, or not. It's your choice. (Just don't try to run this by hand and have the crontab rsync running as well.)
It's important to keep the limitations of this method in mind. If you're working with huge files, then you'll need some major bandwidth. You probably won't get that 2GB file copied to the server from the local coffee shop's DSL connection. Thanks to rsync, you can pretty easily add or update smaller data files. Even if the upload doesn't complete, rsync will pick up where it left off the next time it connects. Keep your eyes out for part two, when we'll look at a few backup options that don't require a terminal to use.
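For the reversed rsync mentioned above, a minimal sketch could look like this. It reuses the example key, host and paths from the backup command; pull_latest is a name we invented for illustration.

```shell
# Same rsync options as the backup, but with the server as the source and
# the local copy as the destination.
pull_latest() {
    local_copy="${1:-/home/willo/data/}"
    rsync -avz -e "ssh -i /home/willo/backup-key" \
        willo@biobug.org:/home/willo/data/ "$local_copy"
}
```

If you're nervous about overwriting anything, adding -n (rsync's dry-run flag) to the options lists what would change without copying a thing.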

The best NAS I've seen if you're on a Mac is the ReadyNAS NV+. It supports 4 SATA drives, up to 500GB each. Supports HFS+ as all Mac users want. And talks over AFP, SMB/CIFS, and a few others. Most NAS drives I find don't support AFP, so finding one that did was very nice.
Ditto on the ReadyNAS NV+. I have one in a WIN and OS X environment and absolutely love it. The model I use has 4 x 750GB drives in RAID 5.
Having a mirror image of your source data does not help very much. If the data becomes corrupted or removed, rsync will mirror the exact image (hence deleting/corrupting the data as well). I highly recommend using rdiff-backup which utilizes librsync and ssh, but will keep archive backups of all your data. That's the only way to truly safeguard against data loss (offsite, archive backups, encrypted connections).
And if you want to save huge headaches, use a professional service like ChoiceBackup: www.choicebackup.com
They further encrypt all the data and have a very expensive infrastructure with redundant servers. You get what you pay for...
You can't be serious about your recommendation of ChoiceBackup.
The prices are unreasonable.
300$ for storing 2GBs for a year ???
Nearly 800$ for 5GB ???
Are you affiliated with that website?
ViceVersa!
That is the most complicated backup scheme I've ever seen. I haven't seen many, but they aren't nearly this crazy. They also don't start out with "we'll need a server".
I was under the impression that the rsync installed in Tiger included HFS support. Is that not correct?
Yeah, you could do all that or you could just download and install Mozy (works on Mac and Windows) and backup 2GB to a remote server for free...
or put a hardware RAID 6 in if you're that serious...
I use one, and all of my data has survived one HDD failure so far
So you need to run cygwin on the box for Windows? That's weak. Just run robocopy from the resource toolkit. Does the same thing rsync does and you can schedule it from the Task Scheduler to do automatic periodic backups as well.
I do daily incremental backups to an external HD (easy to grab if I'm home and there's a fire), and every few months create a DVD set of important data (digital photos, music, documents).
I've been lucky enough not to have to deal with a HD crash in a while though (knock on wood). ;-)
Yyyyyyeah, or you could just spend like $40 on a good backup software package for Windows. I use Backup4All, and like it quite a bit.. handles stuff like restoring multiple revisions of a file, backing up to a folder/disk or FTP site, and more.
I was originally looking at the ReadyNas devices, but decided they were too pricey. I instead found some old base parts (old cpu/mb/memory) bought some SATA drives and a highpoint raid 5 card, and then installed FreeNAS (http://www.freenas.org/) on it which is a FreeBSD NAS distro, it supports all major transfer methods including RSYNC which is mentioned in the article, has a nice small install footprint (~35MB) and is accessible over the web. I'm using 4 500GB SATA drives so after formatting and RAID 5 have ~1.3TB of space for just the cost of the RAID card and drives.
Acronis TrueImage Home is awesome.
I make a restore set of DVDs right after an initial install for each of my PCs, and keep an incremental backup which is updated every few weeks or so (or after I've been working on an important project). I keep the most recent complete backup on an external HDD and burn a copy to DVD. Works great.
rsync is old hat to a lot of us - and even though a lot of folks have graduated to rdiff-backup or duplicity or unison, I still like the KISS philosophy and still use rsync for everything.
All of my personal machines, and as of late, all of my employer's servers have their data replicated to offsite backup with rsync.net. We looked into strongspace and exavault, as well as the ridiculously overpriced offerings from places like Iron Mountain ... and in the end it was this:
http://www.rsync.net/philosophy.html
and this:
http://www.rsync.net/resources/notices/canary.txt
That sold us.
Now to get cygwin+python+duplicity running on my last remaining windows system...
I appreciate the effort you guys went through, but seriously, Linux? When 95% of the market is Windows, a backup how to using Linux seems like a waste of your time and efforts. Sorry, just being honest.
Uhm....as far as a backup, what about going to a 500GB external USB drive. Pair that up with Cobian backup, a free backup utility, you've got a winning Windows solution.
Most people running Linux already know how to make regular backups. How about a how-to for the masses that use Windows?
That being said, this type of article does not really belong on Engadget.
Or.. gee.. use the standard backup utility built into Windows and just point it at the NAS and install a TaskScheduler entry to run the backup utility periodically? And NTBackup does a proper shadow copy so you get all the state, substreams and permission information saved with the files. Most people don't know that Windows has something like the old MacOS file forks, only it can have any number of them. If you use a utility to copy the file that doesn't know about them, you'll lose those substreams - or at least in this case, you don't back them up.
Here are the command line arguments for NTBackup.
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/ntbackup_command.mspx?mfr=true
Hammer and nails, people...
Do any Windows builds of this support Volume Shadow Copy? It's really useful for backing up open files like Outlook PSTs.
You consider installing all that stuff on Windows, including a scripting language and then relying on an Internet connection to some unknown third party company 'simple'? Ignoring the fact that many Internet providers have bandwidth caps, the TIME involved in any significant backup will be horrendous. And then you're paying some one else to store your stuff. This is neither simple, efficient or particularly cost effective.
Simple is getting a 500GB hard drive in an external case ($200) and using NTBackup, which is free and comes with Windows. Nothing to install and you can keep the backup drive somewhere safe when it's not being used. If you're REALLY paranoid - get two 250s and swap them regularly.
If you have Vista Enterprise or Ultimate or MacOS X 10.4, you can even have it handle all the scheduling or do 'constant backup' which not only does constant diff backups, but keeps all the steps like a system-wide CVS repository and allows you to step through changes right in your desktop.
One word: HAMACHI
1. Install it on your desktop and laptop 2. Create a network on your desktop and join it from Hamachi on your laptop 3. Map a drive, ANY drive, from your home network, on your laptop 4. Use any of the plethora of free utilities on the web to sync your data to this drive, like Microsoft SyncToy (don't laugh, it's free and works).
The whole setup is free and easy to setup. N-joy.
Dude, VICEVERSA PRO 2.0 is the way to go for Windows to Windows. I researched backup solutions for months until I decided. I tried everything out there, not kidding. I tried it ALL. Trust me, download ViceVersa 2.0 Pro and you're done. I am not affiliated with them in any way. DO NOT WASTE YOUR TIME like I did with the command line shit unless you have linux.
I use Nova Backup on my Windows machines combined with an external hard drive. On my linux machine I keep important things on a removable HD that I backup to DVD when necessary. Life would royally suck if my Ext. HD crashed.
Using rsync over ssh to backup Mac OS X can present a problem if the network storage device has a limitation on the number of characters for the full directory path for a file/folder. My network attached storage (a Thecus N2100) has a limitation of 250 characters for the full directory path for a file/folder. Quite a number of files/folders in my Mac exceeded the 250-character limitation.
No need to install cygwin on Windows. cwRsync will do just fine. More details here:
http://www.latestintech.com/2007/03/20/sync-windows-directories-with-linux/
wtf!?!? u say to use vi text editor when in the pic u use nano. WtF. nano is better in my opinion.
I have been using Norton Ghost for yrs to back up all my data. Creating a ghost image of your system is real fast if there are not many media files. Other methods are also good but I prefer its simplicity and user friendly functions.
I just went through this backup phase a week ago. I've been really good about my backups (usb drive, laptop, extra hard drive, etc etc, but realized that say my house burns down (or something else drastic), my backups go with it. So I wanted a simple yet stable and safe solution for online backups. I have a webhost that gives me TONS of storage (dreamhost), so I researched and researched, finally I kept it very simple...
cwRsync + Winrar + batch file + task scheduler
cwRsync (rsync compiled for Windows)
Winrar (I use the command-line version)
Batch file (wrote a simple batch file myself to rar and rsync for me when run)
Task Scheduler (so Windows handles running the batch file automatically)
The first time it syncs to the server takes FOREVER (still syncing as we speak). I blame my upload speed (stupid Comcast), I figure I have a week or so left of uploading, but rsync will do its magic with the subsequent syncs and should exponentially lower the upload time.
Winrar password+encryption is pretty secure, so I don't care if the sysadmin at my webhost gets a hold of my files either. Just make sure not to use compression when using rar in the batch file, rsync can't do its magic that well on compressed files (can't find similarities).
Try some Simple Backup with that.
For multi-platform BackupPC (open source) will do the job without hassle - http://backuppc.sourceforge.net/
For Windows only there are numerous backup programs that support automatic backups to ftp ( http://www.google.com/search?hl=en&q=%22backup+software%22+%2B+%22ftp%22 )
You could also use Subversion, which has version control built in. That way you'll have a history of your files, changes made to them, etc as well.
If you just need a Windows file sync solution, checkout:
DSynchronize at http://dimio.altervista.org/eng/
All four of my PCs cross-synchronize their files each day.
It works great!
I don't believe your rsync script will delete old files on the server that have been deleted on your local machine. You need the --delete flag to do that, which incidentally, I haven't gotten to work over ssh, but works fine locally.
Let's not forget that the longer a hard drive is spinning, the more likely it is to crash. You are much better off backing up to an external drive that gets disconnected and thrown in a fire-proof box with the rest of your important stuff.
This article may be a bit too technical for engadget, but it's good to get people thinking about automatic backups.
It really doesn't matter what solution you use for the actual syncs, the important part is getting it set up and working automatically.
I've just written an article about general home backup strategies to keep your data forever, when you've absorbed everything in this article come check it out!
http://www.christophercamps.com/archives/how-to-keep-your-data-forever
This is one of the silliest articles engadget has ever run.
My backup strategy:
a) TrueImage mirrors the boot partition to six backups on both internal and external drives nightly (so I have perfect mirrors going back three days on both drives).
b) SecondCopy backs up several versions of all data files at frequencies from one to 60 days - if I want to know what was in a file a day ago, or 87 days ago, I can find it.
c) Mozy backs up all important data to the net every night.
Once I set it up, it requires no supervision at all, and doesn't slow the computer as it runs from 1 - 5am.
The solution in this article is absolutely nuts.
Interesting timing! I wrote a HOWTO on a backup strategy using rdiff-backup and launchd earlier this week.
You can read it here: http://n3wb.com/boolean/archives/2007/03/howto-rdiff-backup-with-launchd/
For my backup, I already do this, but rather than a shared account I use ExaVault ( http://www.exavault.com/ ) which offers more GB/$ than shared hosting (as well as all the other dedicated offsite backup providers)
For you Windows users out there, keep your eye on Windows Home Server (WHS) - a new product coming out this fall that addresses the needs of home (and SOHO/SMB) non-technical users. The highly automated OS install asks less than a half-dozen very simple questions and connected Windows PCs are backed up automatically at regular, user-specified intervals. WHS also provides shared and private data storage on the server and a lot of other features. I'm beta testing the RC version now and expect to see it released to the public this fall.
More info at http://www.microsoft.com/windows/products/winfamily/windowshomeserver/default.mspx