I maintain several personal websites, and felt they should be backed up in case my Xen host has issues.
Step 1
Create an Amazon S3 account at http://aws.amazon.com/s3/. Once your account is created, you will need to create ‘credentials’, which allow us to authenticate with S3. You can find them under “Amazon -> Account -> AWS Identity and Access Management”: click ‘Security Credentials’ on the left, then create an ‘Access Key’. Each key has two parts: a public portion, called the ‘Access Key ID’, and a private portion (never to be shared) called the ‘Secret Access Key’.
Step 2
We need to install a program called ‘s3cmd’. This will allow us to interface with Amazon S3 via the command line. On Ubuntu:
sudo apt-get install s3cmd
Step 3
Now we need to set up s3cmd so that it saves our settings. Make sure you have the Access Key ID and the Secret Access Key at hand. Run the following command to get started:
s3cmd --configure
From here you will get an interactive prompt:
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3
Access Key: 231231232
Secret Key: 213123123

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password: ubuntu
Path to GPG program [/usr/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP and can't be used if you're behind a proxy
Use HTTPS protocol [No]: yes

New settings:
  Access Key: 231231232
  Secret Key: 213123123
  Encryption password: ubuntu
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: True
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n]
I answered yes to “Use HTTPS protocol”, so uploads go over an encrypted connection. This is a good idea, although it slightly reduces performance and may use slightly more traffic. In addition, s3cmd can encrypt the files with gpg before uploading them (pass the --encrypt option to ‘s3cmd put’), which means that if someone broke into your S3 account, they would still need the passphrase to decrypt your data.
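To double-check what the configuration step saved, the settings can be read back from the config file. This is a minimal sketch, assuming s3cmd wrote its settings to its default location, ~/.s3cfg, as simple "key = value" lines; the helper name s3cfg_get is made up for this example.

```shell
#!/bin/bash
# Print the value of one setting from an s3cmd config file.
# s3cfg_get is a hypothetical helper; the file defaults to ~/.s3cfg.
s3cfg_get() {
    local key="$1" file="${2:-$HOME/.s3cfg}"
    # Strip "key = " from the matching line and print only the first match
    sed -n "s/^${key}[[:space:]]*=[[:space:]]*//p" "$file" | head -n 1
}
```

For example, `s3cfg_get use_https` should print True if HTTPS was enabled at the prompt. Avoid printing secret_key or gpg_passphrase on a shared terminal.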
Step 4
We can now test s3cmd and try to upload a file. You will need to create a ‘bucket’, which is where the files for this project are stored. You can have many buckets, so if you want to keep your projects separate you can create one for each. Bucket names are shared across all of S3 and must be globally unique, so you will want to pick something not likely to be taken:
s3cmd mb s3://sharms.org-wordpress-blog
If that command runs successfully, we now have a new bucket called ‘sharms.org-wordpress-blog’. If not, pick a different name and try again. Now we can test uploading a file:
s3cmd put /home/sharms/testfile.txt s3://sharms.org-wordpress-blog
# Verify it's where we think it is
s3cmd ls s3://sharms.org-wordpress-blog
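The upload and the check can also be combined so that a script notices when a file did not arrive. This is only a sketch: the function name upload_and_verify and the S3CMD variable are invented for this example, and it assumes that ‘s3cmd ls’ prints the full s3:// URI of each object it finds.

```shell
#!/bin/bash
# Command used to talk to S3; override S3CMD to point at another binary.
S3CMD="${S3CMD:-s3cmd}"

# Upload a file, then confirm that it shows up in the bucket listing.
# upload_and_verify is a hypothetical helper, not part of s3cmd.
upload_and_verify() {
    local file="$1" bucket="$2" name
    name="$(basename "$file")"
    "$S3CMD" put "$file" "$bucket/$name" || return 1
    # Fixed-string match on the object's URI in the listing
    "$S3CMD" ls "$bucket/$name" | grep -qF -- "$bucket/$name"
}
```

Usage would look like `upload_and_verify /home/sharms/testfile.txt s3://sharms.org-wordpress-blog`, which returns non-zero if either step fails.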
Step 5
Using bash, we can automate this and back up all of our files daily, weekly, monthly, etc. Here is an example, which I put at ‘/usr/local/bin/backup_blog_to_s3.sh’:
#!/bin/bash
bucket="s3://sharms.org-wordpress-blog"
logger -t backup_blog_to_s3.sh "Backing up sharms.org blog to S3"
cd /var/www
# Archive and compress the blog directory, upload it, then remove the local copy
tar -cf sharms.org.tar blog
bzip2 -9 sharms.org.tar
s3cmd put sharms.org.tar.bz2 "${bucket}"
rm /var/www/sharms.org.tar.bz2
logger -t backup_blog_to_s3.sh "Backing up MySQL database to S3"
# No space after -p: with a space, mysqldump prompts for the password instead
mysqldump -u databaseuser -pdatabasepassword -a -r sharms-wordpress.sql sharms-wordpress
bzip2 -9 sharms-wordpress.sql
s3cmd put sharms-wordpress.sql.bz2 "${bucket}"
rm sharms-wordpress.sql.bz2
You can see from the example that we back up all of the files in the ‘blog’ directory and export all of our data from a MySQL database. You can even change the file names so they include the date of the backup:
tar -cf sharms.org-wordpress-$(date +%d%m%y).tar blog
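The dated naming can be wrapped in a small helper so that every archive gets the same treatment. This is a sketch under a few assumptions: the function names dated_name and archive_dir are made up for this example, and the year-first date format (%Y%m%d) is my choice so that file names sort chronologically; any other date format works the same way.

```shell
#!/bin/bash
# Build a dated file name such as sharms.org-wordpress-20110113.tar
# (dated_name is a hypothetical helper).
dated_name() {
    local prefix="$1" ext="$2"
    printf '%s-%s.%s\n' "$prefix" "$(date +%Y%m%d)" "$ext"
}

# Archive directory $2 (found inside $1) under a dated name and
# print the path of the archive that was written.
archive_dir() {
    local parent="$1" dir="$2" prefix="$3" out
    out="$parent/$(dated_name "$prefix" tar)"
    tar -cf "$out" -C "$parent" "$dir" && printf '%s\n' "$out"
}
```

In the backup script, `archive_dir /var/www blog sharms.org-wordpress` would replace the plain tar call, and the printed path can be passed on to bzip2 and s3cmd.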
Running Automatically
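If we want to back up the system every day, this is very easy. On Debian and Ubuntu, cron skips files in /etc/cron.daily whose names contain a dot, so drop the ‘.sh’ extension when copying:
sudo cp /usr/local/bin/backup_blog_to_s3.sh /etc/cron.daily/backup_blog_to_s3
sudo chmod 755 /etc/cron.daily/backup_blog_to_s3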
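A quick way to tell whether a script in /etc/cron.daily will actually run is to check its name. On Debian and Ubuntu the directory is executed by run-parts, which by default only accepts names made of letters, digits, underscores, and hyphens. The sketch below encodes that rule; the function name cron_name_ok is made up for this example, and other distributions may apply different rules.

```shell
#!/bin/bash
# Succeeds if run-parts would accept this file name under its default rules.
# cron_name_ok is a hypothetical helper.
cron_name_ok() {
    local name
    name="$(basename "$1")"
    case "$name" in
        # Reject empty names and names with any character outside the allowed set
        ""|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

For instance, `cron_name_ok backup_blog_to_s3.sh` fails because of the dot, while `cron_name_ok backup_blog_to_s3` succeeds. Running `run-parts --test /etc/cron.daily` lists the scripts that would be executed without running them.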
Security Notes
With this setup, the biggest risk is someone gaining access to your server and obtaining your Amazon keys. You can always revoke them from the Amazon Web Services control panel, but you don’t want an attacker using your S3 account for nefarious purposes. Although it is beyond the scope of this document, you could set up a user called ‘backups’ and give the file ‘~backups/.s3cfg’ (where s3cmd stores its settings) the permissions ‘600’, to stop other users from reading its contents.
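The permission check can be scripted so that the backup refuses to run with a world-readable key file. This is a minimal sketch, assuming GNU stat as shipped with Ubuntu (the -c option differs on BSD and macOS); the function name is_private is made up for this example.

```shell
#!/bin/bash
# Succeeds only if the file's mode is exactly 600 (owner read/write only).
# is_private is a hypothetical helper; requires GNU stat.
is_private() {
    local file="$1" mode
    mode="$(stat -c '%a' "$file")" || return 1
    [ "$mode" = "600" ]
}
```

A backup script could start with `is_private "$HOME/.s3cfg" || exit 1`, and `chmod 600 ~/.s3cfg` fixes the file if the check fails.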
#1 by Matt Copperwaite (yaMatt) on January 13, 2011 - 10:20 am
Which version of Ubuntu is this for?
#2 by admin on January 13, 2011 - 3:22 pm
It should work on any release from Hardy (8.04) onward. These steps will also work on openSUSE, Arch, Fedora, etc.; just replace the apt-get part with that system’s package manager.
#3 by Ivan on January 13, 2011 - 6:47 pm
Thanks for the article. I have mainly been using Tarsnap, but I’ll give this a try.
#4 by filterfish on January 13, 2011 - 8:20 pm
You’d be better off using duplicity.
#5 by AML Exam on January 14, 2011 - 6:17 am
I’m not used to Ubuntu thanks for sharing this info.
#6 by stoz on January 29, 2011 - 5:30 pm
Hi, nice article.
Getting these errors:
************************************************************
303104 of 5781045 5% in 1s 195.29 kB/s failed
WARNING: Upload failed: / ([Errno 32] Broken pipe)
WARNING: Retrying on lower speed (throttle=0.01)
WARNING: Waiting 3 sec…
still figuring it out.
#7 by stoz on January 29, 2011 - 5:43 pm
It looks like something is messed up at the moment. I cannot create a bucket there and cannot copy data from EC2 Ireland to the US. I’m stuck.
#8 by cutiepink on May 12, 2011 - 12:19 pm
For me the only issue I have found is that I’m also using Jungle, and it puts some weird information in the file names, so it is hard to figure out what a file is outside of Jungle. I pinged the guys from SMEStorage and they said they would have a look at how they could parse this to make an import easier. Thank you.
#9 by Sally on September 9, 2011 - 11:36 pm
I use WordPress and have been using a plugin to do this. Is there an additional benefit to doing it this way, or is this simply a way to do it without WordPress?