Rsync host setup
File Protection can use an rsync host as a backup destination. This allows you to back up data across the internet to any rsync host server. The rsync host can be supplied by a third party or you can set up your own rsync host server.
Adding rsync backups to your backup strategy is an excellent way of insuring yourself against data loss. Critical files can be copied to a secure, offsite location, away from your office, and backing up across the internet overcomes the need to swap tapes or hard drives. You can also use built-in rsync encryption to protect the data on the rsync host, and the data can be accessed from wherever you are, using BackupAssist.
Rsync efficiently manages bandwidth and data allowances and will only transfer data that has changed. This means your internet backups will take less time compared with other remote backup methods such as FTP.
Rsync is an open source application used to synchronize files and directories from one location to another. BackupAssist’s implementation of this technology is in the form of an rsync destination that allows File Protection backup jobs to back up data across the internet. The data transfer is minimized because only the data that has changed is transmitted and all data packets are compressed.
Rsync uses a checksum method to perform the bit level data transfer. Rsync checks whether any data has changed by looking at the size of a file and its modification date. If no data has changed, rsync will not transfer the data, saving time and bandwidth. If files do not match, rsync uses a checksum method called a rolling checksum on the changed file to see what has changed. It will then transfer only the altered or appended data within the file.
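The rolling checksum idea can be sketched in a few lines of Python. This is an illustration of the general technique (two running 16-bit sums over a sliding window), not rsync's actual source code, and the function names are made up for this example. The point it shows is that moving the window forward by one byte only needs the byte that leaves and the byte that enters, so the checksum at every offset of a file can be computed cheaply.

```python
from typing import Tuple

MOD = 1 << 16  # both sums are kept to 16 bits


def weak_checksum(block: bytes) -> Tuple[int, int]:
    """Compute the two sums (a, b) for a block from scratch."""
    length = len(block)
    a = sum(block) % MOD
    # The first byte is weighted by the block length, the last byte by 1.
    b = sum((length - i) * byte for i, byte in enumerate(block)) % MOD
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, block_len: int) -> Tuple[int, int]:
    """Slide the window one byte to the right without re-reading the block."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - block_len * out_byte + a) % MOD
    return a, b


if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog"
    window = 8
    a, b = weak_checksum(data[:window])
    for start in range(1, len(data) - window + 1):
        a, b = roll(a, b, data[start - 1], data[start + window - 1], window)
        # The rolled value always matches a full recomputation.
        assert (a, b) == weak_checksum(data[start:start + window])
```

A weak checksum like this can collide, so a match is normally confirmed with a stronger hash before a block is treated as already present on the server.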
Rsync terminology:
- Data Host - The remote machine that will be used as your backup destination.
- Rsync Server - The same as the data host, but specifically referring to the machine running rsync that accepts incoming connections and data from rsync clients.
- Rsync Client - A machine containing your working data that has BackupAssist installed. BackupAssist comes packaged with the rsync libraries necessary to transfer data to the rsync server during a backup.
To help better understand how rsync transfers work, let’s take a look at a hypothetical three day backup scenario.
The scenario examines three different backup methods: rsync, FTP and incremental drive imaging.
Day 1: We begin with a 4GB data file backup.
Looking at this first backup we see that for the initial data transfer there is a 100% transfer for both incremental drive imaging and FTP. Thanks to rsync’s packet compression we see a 50% reduction in the initial transfer. Depending on your rsync server’s setup this initial overhead can be removed by seeding your backup server locally, a method we will discuss later in this paper.
Day 2: On the second day we have added a further 0.1 GB to the start of our data file.
We can see that both FTP and incremental drive imaging perform a full backup of the file. Rsync only backs up the changed data within the file, and compresses the sent data, resulting in a 50 MB transfer.
Day 3: This day no data has been added, but data has been shifted within the file.
Rsync is able to recognize that the data is already on the backup server and will reorganize the file with a minimal instruction file. Incremental drive imaging is also aware that the data was moved, however it must re-backup the moved data as this section does not match the data source. FTP once again has to do a full backup of the source data.
Summary
As demonstrated in this example, rsync delivers substantial performance gains. Because it can check what data is still the same, and then append, remove or modify data as necessary to match the local source, it can greatly reduce backup overhead.
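The figures from Day 1 and Day 2 can be added up with a short calculation. The sketch below assumes decimal units (1 GB = 1000 MB) and applies the 50% compression quoted in the scenario. Day 3 is left out because the scenario does not give transfer sizes for it, and incremental drive imaging is left out of the comparison for the same reason.

```python
COMPRESSION_RATIO = 0.5   # the 50% packet compression quoted in the scenario

day1_file_mb = 4000       # Day 1: 4 GB data file (decimal units assumed)
day2_added_mb = 100       # Day 2: 0.1 GB added to the file

# FTP sends the whole file on both days.
ftp_mb = [day1_file_mb, day1_file_mb + day2_added_mb]

# Rsync sends the compressed file on Day 1 and only the compressed change on Day 2.
rsync_mb = [day1_file_mb * COMPRESSION_RATIO, day2_added_mb * COMPRESSION_RATIO]

ftp_total_mb = sum(ftp_mb)
rsync_total_mb = sum(rsync_mb)

if __name__ == "__main__":
    print("FTP total over two days (MB):", ftp_total_mb)
    print("Rsync total over two days (MB):", rsync_total_mb)
```

Under these assumptions FTP transfers 8,100 MB over the two days and rsync transfers 2,050 MB.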
The rsync destination that you use can be either an rsync server that you maintain yourself or a third party destination that supports rsync. Third party destinations include IT service providers, data centers, ISPs and cloud providers. These solutions have the advantage of high availability networks with scalable storage.
BackupAssist's implementation of rsync
BackupAssist implements rsync as a destination for File Protection backups. When you create the backup job, rsync is selected as a destination in the Destination step, and configured in the Set up destination step. This section explains how File Protection features apply to rsync and how BackupAssist works with rsync's features.
To learn how to create an rsync backup, see Rsync backups.
Data compression and encryption
BackupAssist supports encryption and compression on the server. BackupAssist for rsync offers industry standard encryption for data stored on the data host. This means that your data is safe “in the cloud”, making external hosting a safe and secure option. Your files are also automatically compressed on the data host, which reduces the amount of disk space used at your hosting company.
Rsync for BackupAssist uses four types of compression:
- Effective transfer compression by only sending changed data.
- All data packets are compressed and encrypted during transfer.
- Single Instance Store (SIS) uses hard link technology to prevent the same files from being stored more than once across backups on your host.
- The source data is encrypted and compressed in an rsync-friendly way before transmission, effectively minimizing the space used by files on the server even further.
Note: If you enable or disable encryption for an rsync job, BackupAssist will need to re-seed the backup to the host with a full set of data (i.e. the next backup will be a full backup).
Single-instance store
File Protection backups cannot use single-instance store when the backup is saved on a ReFS formatted rsync destination. This means all the data will be backed up each time the backup job runs.
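The hard link idea behind a single-instance store can be illustrated with a small Python sketch. This is a simplified model for flat directories, not BackupAssist's implementation, and the function name `snapshot` is made up for this example. A file that is unchanged since the previous backup is added to the new backup as a hard link, so it takes no extra space; a new or changed file is copied. The destination file system has to support hard links for this to work.

```python
import filecmp
import os
import shutil


def snapshot(source_dir, previous_dir, new_dir):
    """Copy source_dir into new_dir, hard-linking files unchanged since previous_dir.

    previous_dir may be None for the first backup.
    """
    os.makedirs(new_dir, exist_ok=True)
    for name in os.listdir(source_dir):
        src = os.path.join(source_dir, name)
        if not os.path.isfile(src):
            continue  # sub-directories are skipped to keep the sketch short
        old = os.path.join(previous_dir, name) if previous_dir else None
        dst = os.path.join(new_dir, name)
        if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
            os.link(old, dst)        # unchanged: a second name for the same stored data
        else:
            shutil.copy2(src, dst)   # new or changed: store a fresh copy
```

After two runs, an unchanged file appears in both backup folders but is stored only once on disk.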
Preservation of file attributes
Because rsync works on top of the Cygwin Unix emulation layer, it does not recognize Windows file attributes (e.g. read-only, hidden), NTFS security attributes (i.e. access control lists), NTFS alternate data streams or file creation times. The only file system attribute preserved when using rsync to transfer data is the Last modified time attribute.
BackupAssist’s implementation of rsync overcomes this limitation by having the option to store NTFS metadata on the backup destination. This option is enabled in the Manage screen under Rsync options. This is checked by default for new jobs created in BackupAssist. If enabled, NTFS streams and security data will be saved to a separate file on the destination and then added back to the file as part of the restore process, when using the Integrated Restore Console. So while these attributes are not "preserved" on the files backed up to your rsync destination, they will still be restored.
This table outlines what attributes are preserved with the NTFS metadata option:
File attributes at destination | Preserved |
Windows File Attributes | No |
Creation time | No |
Last access time | No |
Last modified time | Yes |
NTFS security (Access Control Lists) | Yes |
NTFS alternate data streams (ADSs) | Yes |
Using rsync for files and directories
Rsync performs best when working directly on the file system, backing up normal files and directories.
Below is an example of how rsync performs:
- File system with 50,000 files, 50 GB total
- 50 files of total size 50 MB have changed
In this example, rsync is able to identify which of the 50 files have changed and, for those files, determine what changed. It calculates checksums on 50 MB of data and can complete the backup in a matter of minutes. The amount of data transferred will be around 20 MB for typical documents.
Compare this to a scenario where you use a File Protection backup job to transfer an image file created by a System Protection backup, to an rsync destination. In this example, the VHD is 50 GB.
Rsync will detect that the single VHD file has changed, and needs to determine the in-file deltas. It needs to calculate checksums on 50 GB of data, which may take hours. Additionally, we have found that even if the underlying file system changes very little, about 10% of a VHD file changes from day to day and needs to be transferred. So, about 5 GB will be transferred.
We see here that it is greatly preferable in terms of bandwidth and CPU time to operate rsync on the underlying file system rather than on a backup of that file system.
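The difference between the two scenarios can be put into numbers with a short calculation. The sketch below uses only the figures quoted above and assumes decimal units (1 GB = 1000 MB).

```python
MB_PER_GB = 1000  # decimal units assumed

# Scenario 1: rsync runs directly on the file system.
fs_checksummed_mb = 50     # only the 50 changed files are checksummed
fs_transferred_mb = 20     # approximate transfer for typical documents

# Scenario 2: rsync runs on a single 50 GB VHD image file.
vhd_size_mb = 50 * MB_PER_GB
vhd_daily_change_percent = 10

vhd_checksummed_mb = vhd_size_mb                               # the whole file is checksummed
vhd_transferred_mb = vhd_size_mb * vhd_daily_change_percent // 100

checksum_factor = vhd_checksummed_mb / fs_checksummed_mb
transfer_factor = vhd_transferred_mb / fs_transferred_mb

if __name__ == "__main__":
    print("Checksum workload factor:", checksum_factor)
    print("Transfer size factor:", transfer_factor)
```

With these figures, the VHD scenario checksums 1,000 times as much data and transfers 250 times as much.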
File size and the number of files
In theory, there is no limit to the number of files or directories that you can rsync. Even though rsync only transfers the data that has changed, it still must read all of the data in the file set to check what data has changed. This makes rsync internet backups a disk/CPU intensive operation that can take longer the more your data grows, no matter how little data has actually changed.
We recommend that wherever possible, you use one of the other backup methods provided in BackupAssist (such as BackupAssist’s File Archiving) to regularly archive infrequently used data, so the amount of actual data in day to day use is minimized.
We have run tests on several different file systems – a typical file system of 70,000 files and 24 GB with fewer than 50 MB of daily changes can be synced in around 10 minutes. The largest file system we’ve tested is of 200,000 files and 100 GB, which took 20 minutes to sync minimal changes.
SSH Authentication
For SSH communication, we use a public / private key method of authentication, meaning that you will only be asked for your password once (when registering with the server), and your public key will be uploaded to the server, enabling BackupAssist to log into the server in the future in a secure, password-less manner. For more information on public / private key authentication, visit the following Wikipedia article: Wikipedia Public Key Cryptography
Daemon Authentication
In Daemon mode, your password is stored in an encrypted format by BackupAssist and provided every time the backup runs. When running in Daemon mode, traffic will be unencrypted. For this reason, we recommend that you only use closed network environments, such as LANs or WANs connected by a secure VPN. VPNs inherently encrypt communication between nodes, so using rsync in Daemon mode over a VPN is still secure.
Do-it-yourself rsync hosting
Any rsync Server such as an rsync-enabled NAS device, a Windows Server or a Unix machine can be used to store backups using rsync. The do-it-yourself approach has the advantage of keeping data in your control and a lack of monthly hosting fees.
Rsync servers can be one of two types:
- Rsync over SSH (preferred) runs rsync via a secure shell (SSH, port 22) which means all traffic over the internet is encrypted. User access control is modified by editing user accounts on the server.
- Daemon mode runs rsync as a normal TCP/IP service. User access control is modified by editing the rsyncd.conf file. Internet traffic is not encrypted. To learn more, review our online article Configuring BackupAssist for rsync without SSH, under the section, Altering the rsyncd.conf file.
Note: Windows and Linux data hosts support rsync over SSH. However, some NAS devices do not, and Daemon mode must be used instead. Daemon mode is still an acceptable solution provided a secured LAN/WAN (such as site-to-site VPN) is used.
How to set up a Windows rsync host
To set up a Windows machine to act as an rsync server, you will need to install both SSH and rsync on your Windows Server. We recommend CopSSH and cwRsyncServer. An installer for each can be found on our website.
Step 1 Install Rsync
Setting up a Windows rsync host requires a Windows Server 2008 or later machine with network connectivity and space to store backup data. Windows Small Business Servers (SBS) should not be used as rsync hosts.
The first step is to download and install rsync on the server that will act as the rsync host that you back up to.
- Run the cwRsyncServer installer.
- Continue through the installation wizard, installing the package to a location of your choice.
- During the installation, you will be presented with the popup on the right. We suggest leaving the SvcCWRSYNC account as is. Write down the password provided.
- Click Install to install the package. Once this is finished cwRsync will be present on your system.
Step 2 Install CopSSH
Once rsync is installed you will need to install CopSSH. CopSSH is used to open the secure tunnel between the BackupAssist machine and the rsync server.
- Run the CopSSH installer.
- Continue through the installation wizard, installing the package to a location of your choice.
- During the installation you will be presented with the popup on the right. We suggest leaving the SvcCOPSSH account as is. Write down the password provided.
- Click Install to complete the process of installing CopSSH on your system.
- During the Activate user part of the installation, you will be presented with a popup showing the service status and any active connections. At any time after the install you can access Activate a user from your start menu to allow SSH access to that user. You must activate at least one user before you can register an rsync client.
- Click OK to continue your installation.
Step 3 Activate a user
If you are planning to use SSH, you must activate a user with CopSSH before you register a BackupAssist client with your rsync server.
- In the Start menu, under All Programs, select CopSSH. The CopSSH Control Panel will open.
- To start the process to activate a user, click on the Users tab across the top of the user interface.
- Click on the Add button to bring up the wizard to activate a user.
- Click Forward on the opening screen.
- On the second screen, select the Domain and type in the user which you wish to activate. Click Forward once complete (admin is a manually created account we’ll use for this example).
- Change the Access Type to Linux Shell and Sftp using the drop-down menu. Leave all Options enabled as they are by default.
- On the fourth screen, click on Apply to complete the wizard and activate the user.
Warning: Do NOT activate using your administrator account. Doing so will cause a lock down on the account due to CopSSH’s security settings. We recommend activating a newly created account.
The user should now be showing as activated within the CopSSH Control Panel.
Your user’s home directory will be located at (for example) C:\Program Files\ICW\home\user.
The location of this directory can be changed by editing the file C:\Program Files\ICW\etc\passwd.
Note: If you uninstall rsync, the Windows service users SvcCOPSSH and SvcCWRSYNC are not removed. If you then re-install rsync the Windows users cannot be recreated because the passwords will not match, so the COPSSH and rsync services will not start. The fix is to uninstall and remove the users manually then re-install and add the users again with known passwords.
How to set up a Linux rsync host
Most FreeBSD and Linux servers can be used to host backup data. BackupAssist has two requirements: that the data host has an SSH server and rsync installed. All major Linux distributions (such as Fedora, RedHat Enterprise, Ubuntu, Debian) have these two prerequisites available as install options. The most common SSH server is OpenSSH.
To determine if your system has the prerequisites installed, log into your system, start a shell and type:
man rsync – this should return the man page for rsync if installed. Type ‘q’ to exit the man page.
man sshd – this should return the man page for sshd if installed. Type ‘q’ to exit the man page.
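If you prefer a scripted check, the same question can be answered by looking for the two programs on the search path. The Python sketch below is one way to do this and is not part of BackupAssist; the function name is made up for this example. Note that sshd is often installed in a system directory such as /usr/sbin, which may not be on an ordinary user's PATH, so a missing result for sshd is worth confirming as root.

```python
import shutil


def find_prerequisites(commands=("rsync", "sshd")):
    """Map each command name to its full path, or None if it is not found on PATH."""
    return {name: shutil.which(name) for name in commands}


if __name__ == "__main__":
    for name, path in find_prerequisites().items():
        print(name, "->", path if path else "not found on PATH")
```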
You should use your distribution’s software package manager to install these packages, if they are not already installed. Most commonly they can be found under the Server or Security categories. The next step is to create logons on your data host. We recommend creating a separate logon for each client. For example, if you host data for 5 different companies, create 5 different accounts so that each company will only be able to see their own data. You should also make sure that each client’s home directories are on a partition that contains sufficient space to host their data.
You must also change the permissions on each user’s home directory, otherwise most SSH daemons will not allow you to connect to the server using the public/private key method (which BackupAssist uses). To do this, use the chmod command – for example for a user “fred”, type in the following (when logged on as root): chmod 700 /home/fred
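When accounts are created by a script, the same permission change can be made and verified from Python. The sketch below is a POSIX-only illustration with made-up function names: it applies the equivalent of chmod 700 and then checks that neither the group nor other users have any access to the directory.

```python
import os
import stat


def restrict_to_owner(path):
    """Equivalent of `chmod 700 path`: full access for the owner, none for anyone else."""
    os.chmod(path, 0o700)


def is_owner_only(path):
    """Return True if neither group nor others have any permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

For the example user above, `restrict_to_owner("/home/fred")` would be run as root, and `is_owner_only("/home/fred")` should then return True.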
Note: You can run rsync as a daemon on your Linux server. (For security reasons, we do not recommend this – use rsync over SSH instead.) If you choose to run rsync in daemon mode, you will not need to have the SSH service installed. For instructions on setting up BackupAssist to connect to an rsync daemon please view the How to set up a NAS rsync host section below.
How to set up a NAS rsync host
Backing up to an rsync-enabled NAS can be a very effective solution. The advantage of using a NAS is that, as an appliance, it can be close to a turnkey solution and easier to manage. Each NAS is different and some support rsync over SSH, whereas others only support rsync Daemon mode. There is however a list of requirements that must be met in order for BackupAssist to connect to the device.
To use your NAS as an rsync data host you will need:
- A NAS that is running rsync as a daemon, or one that has rsync and an SSH service running.
- A share set up to act as the root directory for your rsync backups, with read and write permissions allowed on that directory.
- If your NAS requires a password to connect to the rsync service, you will need BackupAssist to authenticate to it.
- Your NAS will need to have the correct ports open for your rsync Daemon or SSH service (873 and 22 respectively).
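A quick way to confirm that the ports are reachable from the BackupAssist machine is to attempt a TCP connection to each one. The Python sketch below does only that. It does not log in or speak the rsync protocol, and the host name in the usage note is a placeholder.

```python
import socket

RSYNC_DAEMON_PORT = 873
SSH_PORT = 22


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `port_is_open("nas.example.com", RSYNC_DAEMON_PORT)` returning False suggests that the daemon is not running or that a firewall is blocking the port.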
Many dedicated NAS devices offer built-in support for rsync. While this can be convenient to set up, many of these devices use low-powered processors which can result in reduced performance if you are backing up large files (several GB or larger in a single file). The options available with these devices vary, and you will need to consult the vendor's manual when setting up the NAS destination.
NAS vendors that support rsync include QNAP, Drobo, NETGEAR and Synology.
NAS hosting tips
Make sure the hardware is rsync compatible
When you select hardware to use as an rsync server, make sure the hardware can support the rsync protocol. If you select a Windows system, it must be able to run cwRsync. A NAS device must have rsync specified as one of the protocols supported. If in doubt, ask your hardware vendor for confirmation.
Processing speed is important!
Rsync can be a very processing intensive protocol - it uses checksums that calculate what data needs to be transferred. A lot of NAS devices come with lower range CPUs built-in. This will affect the overall time taken to complete an rsync backup.
Ensure there is plenty of disk space available
Although you may think you have enough disk space available when you first implement your rsync solution, a common cause of rsync problems is that the storage space eventually runs out. Some of the BackupAssist backup schemes are designed to retain significant amounts of data – meaning the space you have can be used up faster than you expect! Running out of disk space is a common problem and it can cause a lot of problems when it occurs. For this reason, the available storage space on your rsync host should be monitored.
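Monitoring can be as simple as a scheduled script on the rsync host that checks free space on the backup volume. The Python sketch below shows the idea. The 20% threshold is an arbitrary value chosen for illustration, not a BackupAssist recommendation, and the function names are made up for this example.

```python
import shutil


def free_fraction(path):
    """Return the fraction of the volume containing `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def needs_attention(path, min_free_fraction=0.2):
    """Return True if free space on the volume has dropped below the chosen threshold."""
    return free_fraction(path) < min_free_fraction


if __name__ == "__main__":
    backup_root = "."  # placeholder: replace with the directory that holds your backups
    print("Free space: {:.1%}".format(free_fraction(backup_root)))
    if needs_attention(backup_root):
        print("Warning: the backup volume is running low on space")
```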
Make sure you set the correct backup path
Some NAS devices contain a boot partition (similar to Windows Server 2008 R2). Sometimes, if you enter the incorrect path your rsync backup will write to this boot partition – which could in turn cause major issues with your backup and hardware.
Seed your backup
If you’re planning on using a NAS device, you can run your seed backup by connecting your NAS device directly to the local network. This avoids having to seed to a USB drive, and then running the seed to the NAS device in a two-step process (saving you a lot of time).
Double check permissions
Even though you are logged in as a Domain Admin, most NAS devices require users to be set up locally within the unit and have permissions configured locally as well. If you receive permission errors, this is usually the reason.
BackupAssist rsync daemon mode
BackupAssist provides two options for running backups over an Rsync connection: Rsync over SSH and Rsync Daemon. Rsync over SSH runs Rsync via a secure shell (SSH, port 22), which means that all traffic over the internet is encrypted. User access control is modified by editing user accounts on the server. Daemon mode runs Rsync as a normal TCP/IP service and internet traffic is not encrypted. User access control is modified by editing the rsyncd.conf file.
Both Windows and Linux data hosts support Rsync over SSH. However, some NAS devices do not, and where this is the case, Daemon mode must be used instead. Even though your data is not encrypted, Daemon mode is still an acceptable solution provided a secure LAN/WAN (such as site-to-site VPN) is used.
When using Daemon mode, a common issue is that the process is not fully automatic. Unlike Rsync over SSH, you can't just specify a directory to write to and away it goes. Rsync Daemons require a module to be set up first to make it possible to specify the backup directory.
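The difference shows up in how a destination is written on the standard rsync command line: over SSH the host is followed by a single colon and a directory path, while in Daemon mode the host is followed by a double colon and a module name. The Python sketch below builds both forms. It illustrates general rsync syntax rather than what BackupAssist does internally, and the user and host names in the usage note are placeholders.

```python
def ssh_destination(user, host, directory):
    """Rsync over SSH: single colon, followed by a directory path on the server."""
    return "{}@{}:{}".format(user, host, directory)


def daemon_destination(user, host, module, subpath=""):
    """Rsync Daemon: double colon, followed by a module name defined on the server."""
    destination = "{}@{}::{}".format(user, host, module)
    if subpath:
        destination += "/" + subpath.lstrip("/")
    return destination
```

For example, `ssh_destination("fred", "backup.example.com", "/home/fred/backups")` gives `fred@backup.example.com:/home/fred/backups`, while `daemon_destination("fred", "backup.example.com", "test")` gives `fred@backup.example.com::test`.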
Prerequisites
Before attempting to create a Daemon module you must make sure that cwRsync is installed on the Rsync destination. You can download a copy of cwRsync from the BackupAssist website. In addition, the steps to install cwRsync successfully are located in the sections above.
Altering the rsyncd.conf file
First, you need to create the Daemon module to be able to connect to it. This is done by altering the rsyncd.conf file, which is located in C:\Program Files (x86)\ICW by default.
Open this file in a text editor, such as Notepad, and enter something similar to the following:
[test]
path = /cygdrive/c/backup_directory
read only = false
transfer logging = yes
The meaning of each line is as follows:
[test] - This is the name of the module you're creating. (It can be set to anything you choose).
path - This is the directory that you want to back up to on the Rsync server.
read only - This sets the 'path' (see above) to either be read/write or read only. Since we want to write the backup to this directory, it should always be set to 'False'.
transfer logging - Allows any transfer information to be logged for future reference. This can be set to either Yes or No.
Once you have finished editing the rsyncd.conf file, save it to the following location: C:\Program Files (x86)\ICW\etc.
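If you set up modules on several servers, it can help to generate the module text and sanity-check it before saving. The Python sketch below renders a module like the one above from a Windows path and then parses it back with the standard configparser module. The helper names are made up for this example, and the check only works on text that consists purely of module sections; global parameters placed above the first module would make configparser raise an error.

```python
import configparser


def to_cygwin_path(windows_path):
    r"""Convert a path such as C:\backup_directory to /cygdrive/c/backup_directory."""
    drive, rest = windows_path[0], windows_path[2:]
    return "/cygdrive/" + drive.lower() + rest.replace("\\", "/")


def render_module(name, windows_path, transfer_logging=True):
    """Return the text of one writable rsyncd.conf module."""
    lines = [
        "[{}]".format(name),
        "path = {}".format(to_cygwin_path(windows_path)),
        "read only = false",
        "transfer logging = {}".format("yes" if transfer_logging else "no"),
    ]
    return "\n".join(lines) + "\n"


def check_module(conf_text, name):
    """Return True if the named module has a path and is writable."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = parser[name]
    return bool(section.get("path")) and not section.getboolean("read only")


if __name__ == "__main__":
    text = render_module("test", "C:\\backup_directory")
    print(text)
    print("Module looks usable:", check_module(text, "test"))
```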
Starting the Rsync Daemon on the Rsync Destination
Next, you need to make sure that the Rsync Daemon service is started correctly. This is the service used to control the Daemon connection and if not started correctly can cause connection errors.
To start the Rsync Daemon, follow these steps:
- Open a command prompt window by browsing to Start > Run > and typing cmd in the text field provided.
- In the command prompt window, change directory by typing CD "C:\Program Files\ICW\bin" or CD "C:\Program Files (x86)\ICW\bin" on 64-bit machines.
- Start the Rsync Daemon by entering rsync --daemon.
The Rsync Daemon should now be started and BackupAssist should be able to connect to it successfully.
Configuring BackupAssist to use the Rsync Module
The next step is to configure the BackupAssist client to perform a backup using Rsync. Install BackupAssist v5.1 or later. You will have a free 30 day trial, but beyond this trial period, you will need to purchase a license for "BackupAssist" and "Cloud Backup add-on" to continue using it.
To create a new BackupAssist backup job:
- Launch the BackupAssist console and choose File Protection.
- After making your backup selections, click Next.
- Select Rsync as your job type and click Next.
In the Rsync Server options section (see screenshot below):
- Enter your Rsync server name (or IP address), and choose Rsync Daemon.
- Under Path on server, type in the name of the Rsync module you created earlier (in this example 'test').
- Enter the username and password that you want to connect to the Rsync server with.
- Click the Test connection... button to test communication with the Rsync server.
If you've performed each step correctly, the test connection should pass and you will be on your way to performing backups using Rsync with BackupAssist!
If you have any questions about BackupAssist or setting up an Rsync backup, please contact us via email at support@backupassist.com. We'll be happy to help!
Notes
BackupAssist for Rsync is also available as a Standalone license for those wishing to perform Rsync/Internet backup only. A BackupAssist base license is not required if you are using a BackupAssist for Rsync Standalone license. You can read more about the difference between the two BackupAssist for Rsync license types here.
If you plan to purchase a new BackupAssist for Cloud Backup add-on license, you will require a BackupAssist base license for this operation. Download a free trial at www.backupassist.com. Or upgrade your existing base license at http://www.backupassist.com/purchasing/purchase.php
Troubleshooting
Below are three common rsync host setup errors and how to resolve them.
- Test connection failed: Ensure that you are able to ping your rsync server from your BackupAssist server and that you have opened up the appropriate ports on your firewall. Make sure that the username can access the path you have specified.
- SSH Connection Refused: Ensure that the services Openssh SSHD and RsyncServer are started on the data host machine (Administrative Tools > Services). Make sure your firewall is not blocking the attempt.
- Register with server failed: Ensure that you have the correct username and password set up on your rsync server.