Friday, 9 September 2011

Fully Automated Backups of QNAP-TS410

One of my clients has a couple of sites, each with a QNAP NAS device that they use as a file server.  These QNAPs have a built-in "backup" facility that you can use to automate copying the data from one device to another.

Why did I put backup in quotes?  Because this is not really backup, it is replication, which is not the same thing at all.

To illustrate the difference, imagine you have a spreadsheet, say, with important financial data in it.  One day, one of your staff deletes a large chunk of the contents of this spreadsheet and saves the file.  A couple of days later you realise this and go to your backup to get an older version of the file from before the data was removed...

If all you have is a replica of the current files then you are properly stuffed.

Luckily this didn't actually happen to my client, but when I realised what the situation was (I had inherited the environment...) I knew I had to do something about it.

After much research I found there was no simple way to do this, but thanks to one article I found, and a lot of hard work on my part, I managed to piece together a full solution for automated daily backups (with reporting) using rsnapshot, which I thought I would share in the hopes that it might help someone else out.

Many thanks go to http://www.sysnet.co.il/ where I found most of the groundwork for this.  This article builds upon that one, detailing how to fix a few problems I encountered with these specific QNAPs, and also extends it to produce automated reporting on the success of the backups:

http://www.sysnet.co.il/files/Daily%20incremental%20backups%20with%20rsnapshot.pdf

Installation of Packages

Log on to the web interface of the QNAP

  • Go to Applications… QPKG Plugins
  • Click GET QPKG
    • If this fails – “Sorry, QPKG information is not available”
      • Check DNS servers have been configured correctly under System Administration… Network
      • If still not working, try downloading the package from somewhere else and copying to the NAS as this is very internet speed dependent and prone to timing out.
  • Select the Optware IPKG and download the appropriate package
    • For the QNAP-TS410 devices this is the ARM (x10/x12/x19 series) for TS-419P
    • IPKG just allows you to install a much wider range of packages onto the QNAP
  • Extract the zip file to get a .qpkg file, then click on INSTALLATION
  • Browse to the qpkg file and click INSTALL
    • If this fails then use scp to copy the qpkg file to the NAS and then install from the command line using 
    • sh Optware0.99.163_arm-x19.qpkg
      and this will show you more information about why it has failed
    • If you get an error saying:
      “Optware 0.99.163 installation failed. /share/MD0_DATA/optware existed. Please remove it first.”
      Then type the following command and try the install again:
      rm -rf /share/MD0_DATA/.qpkg/Optware
  • Once installation is successful, go to QPKG INSTALLED
  • Click on Optware and click ENABLE
  • Use putty (or whatever) to connect via SSH to the QNAP
  • Run the following commands to install the necessary packages (a quick check to verify them follows this list)
    • /opt/bin/ipkg update            (there will be some errors here but ignore them)
      /opt/bin/ipkg install rsnapshot
      /opt/bin/ipkg install nano
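
If the installs succeed, you can confirm ipkg knows about the new packages – just a sanity check, assuming the default /opt prefix:

/opt/bin/ipkg list_installed | grep rsnapshot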

Rsnapshot Configuration
Configuration of rsnapshot is managed by editing the file /opt/etc/rsnapshot.conf
Initially this needs to be set up with the location on the file system to store the snapshots and a list of folders to be backed up.
Note that for remote backups the data is pulled from the source server by the backup server, NOT pushed to the backup server as you might expect.
To set the snapshot folder location, edit /opt/etc/rsnapshot.conf

nano /opt/etc/rsnapshot.conf

and change:

snapshot_root  /opt/var/rsnapshot
to
snapshot_root  /share/MD0_DATA/rsnapshot

Exclude any rsnapshot backups and other data you do not want backed up by adding the following:

exclude        /share/MD0_DATA/rsnapshot
exclude        /share/MD0_DATA/Backup
exclude        /share/MD0_DATA/Qdownload
exclude        /share/MD0_DATA/Qmultimedia
exclude        /share/MD0_DATA/Qrecordings
exclude        /share/MD0_DATA/Qusb
exclude        /share/MD0_DATA/Qweb

Change the intervals to keep however many backups you want (rsnapshot is very efficient at not duplicating data so I configure it for 3650 daily backups, or ten years’ worth) by setting the BACKUP INTERVALS section as follows:

#interval      hourly     6
interval       daily      3650
#interval      weekly     4
#interval      monthly    12

Comment out the default backup sets by putting a hash in front of these lines in the BACKUP POINTS / SCRIPTS section

#backup        /etc/      localhost/
#backup        /opt/etc/  localhost/

Add the following line to carry out the actual backup:

backup admin@<site1_ip_address>:/share/MD0_DATA/    <site1>

NOTE:  The fields in this file must be separated by tabs, not spaces
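
A quick way to catch accidental spaces is to grep for a space immediately after a keyword (the pattern below is "backup" followed by a single space) – any output means that line needs its separator changing to a tab.  The configtest below will also flag this:

grep -n "^backup " /opt/etc/rsnapshot.conf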

After making any changes to the rsnapshot.conf, run the following command to make sure that everything is OK

rsnapshot configtest
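
If all is well this prints "Syntax OK".  You can also do a dry run using rsnapshot's -t flag, which prints the commands a real backup would execute without actually running them:

rsnapshot -t daily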

Setting Up SSH Keys
In order for the scheduled backups to be able to access the remote NAS device we need to set up SSH keys – this will allow the backup server to present a certificate to the remote server for authentication.
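If the backup server does not already have a key pair in ~/.ssh, generate one first – use an empty passphrase (-N "") so the scheduled job can run unattended:

ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
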
Then copy the public key from the backup server to the source server:

ssh admin@<site1_ip_address> "echo `cat ~/.ssh/id_rsa.pub` >> ~/.ssh/authorized_keys"

NOTE: the above command is all one line

Accept the authenticity of the host if prompted, then enter the admin password when prompted.
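
You can now verify the keys are working by SSHing from the backup server to the source NAS – it should log you straight in without asking for a password:

ssh admin@<site1_ip_address>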
Next set the permissions on the configuration directory, and add the same command to the autorun script so that it is reapplied following a reboot.  The autorun script lives on a small flash partition which has to be mounted before you can edit it (see http://wiki.qnap.com/wiki/Autorun.sh).
Login to the source NAS (Site 1) and enter the following commands:

chown admin.administrators /mnt/HDA_ROOT/.config
mount -t ext2 /dev/mtdblock5 /tmp/config
nano /tmp/config/autorun.sh

Add this line to autorun.sh and save it:

chown admin.administrators /mnt/HDA_ROOT/.config

Then unmount the flash partition:

umount /tmp/config

Scheduling Rsnapshot
Cron is used to schedule the actual snapshots.  To configure this, edit /etc/config/crontab so that it has the following rsnapshot entry:

0 23 * * * /opt/bin/rsnapshot daily

After making any changes to this file, make sure to run the following command to ensure the changes are preserved if the NAS is rebooted:

crontab /etc/config/crontab
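
To confirm cron has picked up the change, list the active crontab:

crontab -l | grep rsnapshot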

 

Email Configuration
To set up the NAS for email, use the QNAP web administration page, go to System Administration… Notification… and enter the settings for your mail server.

You can send a test message from the ALERT NOTIFICATION tab to check this is working.
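
You can also test mail delivery from the shell using the same ssmtp binary that the reporting script below relies on.  A quick sketch – the addresses are placeholders and the command must be all on one line:

printf "To: recipient@example.local\nFrom: sender@example.local\nSubject: ssmtp test\n\nTest message from the NAS\n" | /mnt/ext/usr/sbin/ssmtp recipient@example.local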

Setting Up Monitoring
The rsnapshot log file is stored here: /opt/var/log/rsnapshot
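
To watch a backup as it runs, you can follow this log:

tail -f /opt/var/log/rsnapshot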

For automated alerting, create the following script and then schedule it through crontab (make sure the script is stored under /share/MD0_DATA/… as files stored elsewhere, in /root for example, are removed on system reboot.)

/share/MD0_DATA/rsnapshot/check_backup.sh

#!/bin/bash
recipient=recipient@example.local
sender=sender@example.local
subject="Site 1 Backup Report"
log=/opt/var/log/rsnapshot
tmpfile=/opt/var/log/rsnapshot.tmp

# Append a timestamped message to the rsnapshot log
# (-e is needed so /bin/echo interprets the trailing \n)
function log {
  /bin/echo -e "[`date +%d/%b/%Y:%H:%M:%S`] $*\n" >> $log
}

# rsnapshot removes its pid file as its final action, so this
# line only appears in the log once the backup has finished
if grep -q "rm -f /opt/var/run/rsnapshot.pid" $log
then
  if grep -q ERROR: $log
  then
    status=FAILED
  else
    status=SUCCESSFUL
  fi
else
  status="FAILED, INCOMPLETE"
  # The following lines safely kill the backup to stop it
  # impacting performance during the day
  # comment them out if you do not want this to happen
  log "Backup window exceeded, terminating processes"

  # The [/] bracket trick stops grep matching its own command line;
  # on the QNAP's busybox ps the PID is the first column
  snapPID=`ps -ef | grep "[/]opt/bin/perl -w /opt/bin/rsnapshot daily" | awk '{print $1}'`
  rsyncPID=`ps -ef | grep "[/]opt/bin/rsync -a --delete --numeric-ids" | awk '{print $1}'`
  log "rsnapshot pid = $snapPID"
  log "rsync pid = $rsyncPID"
  log "Killing rsnapshot process"
  kill $snapPID
  log "Sleeping for 5 minutes to wait for rsnapshot to exit"
  sleep 300
  log "Killing rsync processes"
  kill $rsyncPID
fi

# Build the report headers and summary, then mail them with the full log
echo "To: $recipient" > $tmpfile
echo "From: $sender" >> $tmpfile
echo "Subject: $subject - BACKUP $status" >> $tmpfile
# Full path to rsnapshot so this also works from cron's limited PATH
echo "Disk space used by backups: $(/opt/bin/rsnapshot du | grep total)" >> $tmpfile
echo "Most recent backup sizes: " >> $tmpfile
echo "$(du -sh /share/MD0_DATA/rsnapshot/daily.0/*)" >> $tmpfile
echo "Backup Log:" >> $tmpfile

cat $tmpfile $log | /mnt/ext/usr/sbin/ssmtp -v $recipient

# Clear down ready for the next run
rm $log
rm $tmpfile


NOTE: the long snapPID and rsyncPID lines have been wrapped in this document – they must each be all on one line in the actual script.

Make the script executable by typing the following command:

chmod +x /share/MD0_DATA/rsnapshot/check_backup.sh
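
It is worth running the script once by hand to confirm the report email arrives.  Note that it deletes the current log file when it finishes, so the next report will start from a fresh log:

/share/MD0_DATA/rsnapshot/check_backup.sh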

Then schedule the script to run each morning by adding the following line to the crontab:

0 7 * * * /share/MD0_DATA/rsnapshot/check_backup.sh

Restoring Data
Use the following command to show all the backups and when they completed:

[~] # ls -Ahlt  /share/MD0_DATA/rsnapshot/
drwxr-xr-x    3 admin    administ     4.0k Jul 10 23:21 daily.0/
drwxr-xr-x    3 admin    administ     4.0k Jul  9 23:40 daily.1/
drwxr-xr-x    3 admin    administ     4.0k Jul  8 23:41 daily.2/
drwxr-xr-x    3 admin    administ     4.0k Jul  7 23:37 daily.3/
drwxr-xr-x    3 admin    administ     4.0k Jul  6 23:29 daily.4/
drwxr-xr-x    3 admin    administ     4.0k Jul  5 23:27 daily.5/
drwxr-xr-x    3 admin    administ     4.0k Jul  5 04:59 daily.6/

In each of the daily folders there is a subdirectory for each server it has backed up and below that is a full replica of that server’s data.

Restoring files is simply done using the scp command on the backup NAS, which has the following syntax:

scp <source_file> <site1_ip_address>:<destination>

If the source file/path has spaces in it, enclose the full path in quotes.

So, to restore the file expenses.xls in the Contractors share from the 7th July, use this command:

scp "/share/MD0_DATA/rsnapshot/daily.2/<site1>/share/MD0_DATA/Contractors/expenses.xls" <site1_ip_address>:/share/MD0_DATA/Contractors

NOTE:  The above command is all one line.
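
To restore a whole folder rather than a single file, use scp's -r flag to copy recursively, for example (again, all one line):

scp -r "/share/MD0_DATA/rsnapshot/daily.2/<site1>/share/MD0_DATA/Contractors" <site1_ip_address>:/share/MD0_DATA/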

Dealing With Overrunning Backups
If backups overrun into the day they could affect performance of the NAS and internet access.  This is particularly likely when the system is first set up, as there may be a lot of data to transfer.  Once the initial transfer has taken place only changes are sent, so the backup is much quicker.

The check_backup.sh script automatically detects this and stops the backup running, but if you need to do it manually at another time then perform the following steps:

To check if a backup is still running, SSH to the NAS and type the following commands:

[~] # ps aux | grep rs
31897 admin   3256 S   /opt/bin/perl -w /opt/bin/rsnapshot daily
32219 admin   2140 S   /opt/bin/rsync -a --delete --numeric-ids
32220 admin   4748 S   /opt/bin/ssh -l admin <site1_ip_address>
32301 admin   1932 S   /opt/bin/rsync -a --delete --numeric-ids

If this shows any rsync or rsnapshot processes then the backup is still running.  To stop the backup kill the processes in this order:

/opt/bin/perl -w /opt/bin/rsnapshot daily
/opt/bin/rsync ….

To do this, use the command “kill” followed by the PID (process identifier) – this is the first number in the output of the ps command.  So for the example above, to stop the backup, issue these commands.

kill 31897
ps aux | grep rs

Wait for the rsnapshot daily process to exit – this may take a while so periodically run the ps command again to check if it is still there.  Once it has finished, then kill the first rsync process (this will stop the others as well)

kill 32219

NOTE:  Use the correct PIDs, don’t just copy the ones here!!!
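
Rather than re-running the ps command by hand while you wait, a small loop can do the polling for you (the [r] in the pattern stops grep matching its own command line).  When it exits, rsnapshot has gone and it is safe to kill the rsync process:

while ps aux | grep -q "[r]snapshot daily"; do sleep 30; done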

If you kill the rsync process first then rsnapshot notices this and rolls back the snapshot that was in progress, deleting all the data it had copied.  Any previous days’ data is still there, but if the backup was overrunning then you do not want to have to start copying the new data all over again – by killing rsnapshot first, the backup can pick up where it left off next time.
Rsnapshot: http://www.rsnapshot.org/

12 comments:

  1. Thanks man, I was in the exact same situation. Very helpful article.

  2. This is a great article, but does this qpkg stop running network processes and open files?  For example, if the QNAP is serving webpages this backup could get stuck on any currently opened files and eat its own RAM to death trying to back them up.
    How is this problem addressed?

    1. Hi John, that is a good question.

      This solution uses rsync which doesn't have any trouble backing up open files per se (so no danger of running out of RAM or anything), but if the file is being actively written to when the copy happens then the data in the copied file will be inconsistent and so useless.

      This is an issue for any backup solution and the way around it is to freeze the data in some way while the backup runs to ensure you have a consistent snapshot of the file at a given point in time. There are various ways of doing this - on Windows there is VSS, on Linux you can use LVM snapshots, but at a basic level you should either stop any processes that are actively writing to files or use an application specific backup process to create a snapshot of the data you want to copy.

      For instance, with your example of serving web pages from the QNAP, if these web pages are static then you don't need to worry - rsync will copy these even if they are open at the time. If the web site is data driven and you have say a MySQL backend then you should use mysqldump to create a consistent copy of the database and include that copy in the backup script rather than the active MySQL database file.

      Hope this clears things up?

  3. Nice article - does broadly what I need but if I run the backup from the command line I get prompted for the password on the remote system each time (and for each backup line in the .conf file) - thus the crontab job will presumably (?) fail.

    What am I doing wrong and is there a way of testing the SSH login without running a backup job?

    1. Hi Dave,

      If you are getting prompted for a password that means that the SSH keys are not working - go back over that section of the article, particularly checking that ~/.ssh/authorized_keys on the source server contains the backup server's key and that the file permissions have been set.

      You can test the keys are working by just SSHing to the source server from the backup server as the admin user:

      # ssh admin@

      If all is working, this should log you into the source server without prompting for a password

      Hope this helps

      Jon

    2. True – rsnapshot has no rights to read in ~/.ssh/, therefore it can't access the private key file.  That's all this issue is.

      I moved the key and pointed rsnapshot at it in rsnapshot.conf with:

      ssh_args    -i /path/to/keyfile/keyfilename

      Works just fine.

  4. I think I may have had a name in one place and an IP address in another but I'm not sure.

    In the end I installed openssh and followed the instructions at http://en.gentoo-wiki.com/wiki/Rsnapshot with some QNAP modifications for paths and such like - I like the idea of using a user with limitations to access the box remotely but it was a pain to set up and I wouldn't have started without your article and certainly not have got it working so thanks again :-)

    Now I need to implement your reporting script to let me know it works!

  5. Hello,

    I am hoping you may find something simple I am missing. I have one QNAP NAS at my location and one in another state connected via VPN. My config has the following:

    NAS 1 my location: backup admin@10.0.16.4:/share/MD0_DATA/ OFFICENAS-FD
    NAS 2 in another state: backup admin@10.0.0.160:/share/MD0_DATA/ OFFICENAS-RE

    10.0.0.160 is the IP of NAS 1 and OFFICENAS-RE is the DNS name
    10.0.16.4 is the IP of NAS 2 and OFFICENAS-FD is the DNS name

    I have a file called Test-doc.txt on NAS 1 in the Public share and I was planning on changing the text every day to make sure the backups are working. On the first day (today) I noticed that the file was not backed up, or so I thought. On NAS 2 in the directory /share/MD0_DATA/rsnapshot/daily.0/OFFICENAS-RE/share/MD0_DATA/Public there are no files, however, on NAS 1 the file is showing in /share/MD0_DATA/rsnapshot/daily.0/OFFICENAS-FD/share/MD0_DATA/Public. It seems like the files are getting backed up but only locally, which makes no sense because network traffic is very high during the backups.

    1. Hi Anon,

      That does seem strange.

      I would check the logfile, or run rsnapshot daily manually and go through the output - might give a clue as to what is happening here.

      Apologies for the delayed reply
      Jon

  6. Hi Jon and thx for your awesome article!
    The system is configured correctly, but I'm having a number of problems related to rsync: after the first two or three backups, I keep getting errors of this type: "rsync error: error in rsync protocol data stream (code 12) at io.c(605)".
    I do not know what to do!
    I also enabled sync_first to avoid rotating incomplete backups every time.
    Do you think this problem could be related to qnap or hard drives that I have installed?
    thanks

    1. Hi there!

      That normally happens when the network connection drops for some reason - is this a local LAN backup or over the internet? I would check network is stable first, maybe try running the backup locally and see if you get the same errors?

      Jon

  7. Sorry, got it mixed up.
    Go to the target NAS and mount the flash partition according to this:
    http://wiki.qnap.com/wiki/Autorun.sh
    Put this into the file:

    chown admin.administrators /mnt/HDA_ROOT/.config

    Save, close, then unmount using:

    umount /tmp/config
