Rsync: Detailed Explanation
1. What is Rsync
Rsync (remote sync) is a remote data synchronization tool that can quickly synchronize files between multiple hosts over a LAN or WAN. Rsync uses the so-called "Rsync algorithm" to synchronize files between the local and remote hosts. This algorithm transfers only the parts of the two files that differ instead of transferring the entire file every time, so it is quite fast.
Rsync was originally written as a replacement for rcp. It is hosted at rsync.samba.org, and the format of the rsyncd.conf file is similar to the main configuration file of Samba. Rsync can be used over rsh or ssh, and it can also run in daemon mode. In daemon mode, the Rsync server listens on port 873 and waits for clients to connect. If the module requires authentication, the server checks the client's password when it connects, and the file transfer begins once verification passes. The first synchronization transfers every file in full; later runs only need to transfer the differences.
Rsync supports most Unix-like systems, including Linux, Solaris, and BSD. Versions for the Windows platform are also available, such as cwRsync and Sync2NAS.
The basic features of Rsync are as follows:
- The entire directory tree and file system can be mirrored and saved;
- The original file permissions, timestamps, soft links, hard links, and so on can easily be preserved;
- No special permissions are required to install;
- Optimized process, high file transfer efficiency;
- You can use rsh, ssh and other methods to transfer files, or you can use a direct socket connection;
- Anonymous transfers are supported.
2. Rsync synchronization algorithm
The reason why Rsync can synchronize files very quickly is that the "Rsync synchronization algorithm" can calculate the data that needs to be backed up in a very short time. The description of Rsync's synchronization algorithm is as follows:
Assume that similar files A and B are to be synchronized between two computers, No. 1 and No. 2, where No. 1 has access to file A and No. 2 has access to file B. And assume that the network bandwidth between hosts No. 1 and No. 2 is very small. Then the rsync algorithm will be completed through the following five steps:
- No. 2 splits file B into a set of non-overlapping data blocks of fixed size S bytes; the last block may be smaller than S.
- No. 2 performs two checks on each segmented data block: one is a 32-bit rolling weak check, and the other is a 128-bit MD4 strong check.
- No. 2 sends these verification results to No. 1.
- No. 1 searches all data blocks of size S in file A (the offset can be arbitrary and does not have to be a multiple of S) to find a data block with the same weak checksum and strong checksum as a block in file B. This work can be done quickly with the help of the rolling checksum feature.
- No. 1 sends No. 2 a series of instructions to generate a copy of file A on No. 2. Each instruction is either a reference to a data block that file B already has, which does not need to be retransmitted, or a piece of literal data that does not match any data block in file B.
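The five steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not rsync's actual implementation: MD5 stands in for the MD4 strong checksum (MD4 is often unavailable in current hashlib builds), the block size is passed in by the caller, and all function names are hypothetical.

```python
import hashlib

M = 1 << 16  # each half of the 32-bit weak checksum is kept modulo 2^16


def weak_checksum(block: bytes):
    """Return the (a, b) halves of the weak checksum of one block."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, size):
    """Slide a window of `size` bytes one position to the right in O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - size * out_byte + a) % M
    return a, b


def strong_checksum(block: bytes) -> bytes:
    # Assumption: MD5 replaces the MD4 checksum named in the text.
    return hashlib.md5(block).digest()


def signatures(file_b: bytes, size: int):
    """Steps 1-3: No. 2 checksums the fixed-size blocks of file B."""
    table = {}
    for start in range(0, len(file_b), size):
        block = file_b[start:start + size]
        entry = (strong_checksum(block), start // size)
        table.setdefault(weak_checksum(block), []).append(entry)
    return table


def delta(file_a: bytes, table, size: int):
    """Steps 4-5: No. 1 scans file A at every offset and emits
    ("copy", block_number) or ("literal", data) instructions."""
    ops, literal, pos, weak = [], bytearray(), 0, None
    while pos + size <= len(file_a):
        window = file_a[pos:pos + size]
        if weak is None:
            weak = weak_checksum(window)
        match = None
        for strong, number in table.get(weak, []):
            if strong == strong_checksum(window):
                match = number
                break
        if match is not None:
            if literal:
                ops.append(("literal", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))
            pos += size
            weak = None  # recompute for the next, non-adjacent window
        else:
            literal.append(file_a[pos])
            if pos + size < len(file_a):
                weak = roll(*weak, file_a[pos], file_a[pos + size], size)
            pos += 1
    literal.extend(file_a[pos:])
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops


def patch(file_b: bytes, ops, size: int) -> bytes:
    """No. 2 rebuilds file A from its own blocks plus the literal data."""
    out = bytearray()
    for kind, value in ops:
        out += file_b[value * size:(value + 1) * size] if kind == "copy" else value
    return bytes(out)
```

The weak checksum is cheap to update as the window slides, so the expensive strong checksum is only computed when the weak one already matches an entry in the table.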
3. Rsync parameter description
3.1 rsyncd.conf configuration file
Global Parameters
All parameters before [module] in the file are global parameters. Of course, you can also define module parameters in the global parameter part. In this case, the value of the parameter is the default value for all modules.
port: Specifies the port number used by the background program. The default is 873.
motd file: Specifies a message file. When a client connects to the server, the contents of this file are displayed to the client. By default, there is no motd file.
log file: "log file" specifies the log file of rsync instead of sending the log to syslog. For example, it can be specified as "/var/log/rsyncd.log".
pid file: Specifies the pid file of rsync, usually specified as "/var/run/rsyncd.pid".
syslog facility: Specifies the syslog facility used when rsync sends log messages to syslog. Common facility names are: auth, authpriv, cron, daemon, ftp, kern, lpr, mail, news, security, syslog, user, uucp, local0, local1, local2, local3, local4, local5, local6, and local7. The default value is daemon.
Module parameters
A module defines which directory of the server is to be synchronized. It is declared in the form "[module]". This name is the name seen by the rsync client, much like a share name provided by a Samba server. The data that the server actually synchronizes is specified by path. Multiple modules can be defined as needed. The following parameters can be defined in a module:
comment: Assign a description to the module. This description, along with the module name, is displayed to clients when they connect to get a list of modules. No description is defined by default.
path: specifies the directory tree path for backup of this module. This parameter is required.
use chroot: If "use chroot" is true, rsync first chroots to the directory specified by the path parameter before transferring files. This provides additional security, but it requires root permissions and makes it impossible to back up files or directories that symbolic links point to outside the module path. The default value of "use chroot" is true.
uid: This option specifies the uid that the daemon should have when the module transfers files. Used with the gid option, it can determine which files can be accessed and what permissions they have. The default value is "nobody".
gid: This option specifies the gid that the daemon should have when this module transfers files. The default value is "nobody".
max connections: Specifies the maximum number of concurrent connections for this module, to protect the server. Clients whose connection requests exceed the limit are told to try again later. The default value is 0, which means there is no limit.
list: This option specifies whether the module should be listed when a client requests a list of available modules. If this option is set to false, hidden modules can be created. The default value is true.
read only: This option sets whether clients are allowed to upload files. If true, any upload request will fail. If false and the server directory read and write permissions allow, then the upload is allowed. The default value is true.
exclude: Used to specify multiple files or directories (relative paths) separated by spaces and add them to the exclude list. This is equivalent to using --exclude in the client command to specify the pattern. A module can only specify one exclude option. However, it should be noted that this option has certain security issues. It is very likely that the client will bypass the exclude list. If you want to ensure that specific files cannot be accessed, it is best to use it in conjunction with the uid/gid option.
exclude from: Specifies a file name that contains the exclude pattern definition. The server reads the exclude list definition from this file.
include: used to specify that the files or directories that meet the requirements are not excluded. This is equivalent to using --include to specify the pattern in the client command. Combining include and exclude can define complex exclude/include rules.
include from: Specifies a file name that contains the definition of the include pattern. The server reads the include list definition from this file.
auth users: Specifies a list of usernames separated by spaces or commas. Only these users are allowed to connect to the module. The users here have nothing to do with system users. If "auth users" is set, each client connection request to the module is challenged by rsync for identity verification, using a challenge/response authentication protocol. User names and passwords are stored in plain text in the file specified by the "secrets file" option. By default, clients can connect to the module without a password (that is, anonymous mode).
secrets file: Specifies a file containing username:password pairs, one pair per line. This file is only used if "auth users" is defined. Generally speaking, passwords should not exceed 8 characters. There is no default secrets file name, so you must specify one (for example: /etc/rsyncd.passwd). Note: when "strict modes" is enabled, the permissions of this file should be 600 so that other users cannot read it; otherwise authentication fails and the client cannot connect to the server.
strict modes: Specifies whether the permissions of the secrets file are checked. If the value is true, the secrets file may only be accessible to the user running the rsync server, and no other user may access it. The default value is true.
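The mode-600 advice above can be checked programmatically. The following is a minimal sketch, assuming a POSIX system; the function names, the "backup" user, and its password are hypothetical, and the check is deliberately conservative (it rejects any group or other access) rather than a reproduction of rsync's own test.

```python
import os
import stat
import tempfile


def write_secrets(path: str, users: dict) -> None:
    """Write username:password lines and restrict the file to mode 600."""
    with open(path, "w") as handle:
        for name, password in users.items():
            handle.write(f"{name}:{password}\n")
    os.chmod(path, 0o600)


def secrets_file_is_private(path: str) -> bool:
    """True if neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


# Usage: create a throwaway secrets file, then loosen it to show the check.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "rsyncd.passwd")
    write_secrets(demo_path, {"backup": "example-password"})
    private_after_write = secrets_file_is_private(demo_path)
    os.chmod(demo_path, 0o644)
    private_after_loosening = secrets_file_is_private(demo_path)
```

Running a check like this before starting the daemon can catch a world-readable secrets file before clients start failing authentication.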
hosts allow: Specifies which client IP addresses are allowed to connect to this module. Client patterns can take the following forms:
A single IP address, for example: 192.168.0.1
The entire network segment, for example: 192.168.0.0/24, or 192.168.0.0/255.255.255.0
Multiple IPs or network segments need to be separated by spaces, and "*" means all. By default, all hosts are allowed to connect.
hosts deny: Specifies machines that are not allowed to connect to the rsync server. It uses the same pattern forms as hosts allow. By default, there is no hosts deny definition.
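To make the address forms listed for hosts allow concrete, here is a small Python sketch that tests a client address against a space-separated pattern list. It only covers the forms described above (a single IP, a network with a prefix length or netmask, and "*"); the function name is hypothetical and this is not how rsync itself implements the check.

```python
import ipaddress


def host_allowed(client_ip: str, patterns: str) -> bool:
    """Return True if client_ip matches any space-separated pattern."""
    address = ipaddress.ip_address(client_ip)
    for pattern in patterns.split():
        if pattern == "*":
            return True
        # A bare address becomes a network containing only itself;
        # strict=False tolerates host bits set in the network address.
        if address in ipaddress.ip_network(pattern, strict=False):
            return True
    return False
```

For example, host_allowed("192.168.0.7", "192.168.0.0/24") and host_allowed("192.168.0.7", "192.168.0.0/255.255.255.0") describe the same network segment in the two notations mentioned above.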
ignore errors: Specifies that rsyncd ignore IO errors on the server when determining whether to run the delete operation during the transfer. Generally speaking, rsync will skip the --delete operation when an IO error occurs to prevent serious problems caused by temporary resource shortages or other IO errors.
ignore nonreadable: Specifies that the rsync server completely ignores files that the user does not have permission to read. This is useful when the directory to be backed up contains files that the backup user should not be able to obtain.
lock file: Specifies the lock file that supports the max connections parameter. The default value is /var/run/rsyncd.lock.
transfer logging: Makes the rsync server log each download and upload operation, in a format similar to that used by FTP daemons.
log format: Customizes the fields written to the log when transfer logging is used. The format is a string that may contain the following format specifiers:
%h Remote host name
%a Remote IP address
%l File length in bytes
%p The process id of this rsync session
%o Operation type: "send" or "recv"
%f File name
%P module path
%m Module Name
%t Current time
%u Authentication user name (null for anonymous)
%b Number of bytes actually transmitted
%c When sending a file, this field records the number of checksum bytes received for that file
The default log format is: "%o %h [%a] %m (%u) %f %l". Generally, "%t [%p] " is added to the head of each line. A perl script called rsyncstats is also included in the source distribution to summarize log files in this format.
timeout: Overrides the I/O timeout specified by the client. This option ensures that the rsync server does not wait forever for a crashed client. The timeout is in seconds, and 0, the default value, means no timeout. For anonymous rsync servers, a reasonable value is 600.
refuse options: Defines a list of command-line options that clients are not allowed to use with this module. You must use the full option names here, not the abbreviations. When an option is refused, the server reports an error message and exits. If you only want to prevent compression, use "dont compress = *" instead, which avoids returning an error to clients that request compression.
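As an illustration of how the specifiers combine, the sketch below expands a log format string from a dictionary of field values. The function name and the sample values are hypothetical; rsync does this expansion internally, so this only shows what a resulting line looks like.

```python
import re

DEFAULT_FORMAT = "%o %h [%a] %m (%u) %f %l"


def render_log_line(fmt: str, fields: dict) -> str:
    """Replace each %X specifier with fields["X"]; unknown ones are kept."""
    def substitute(match):
        key = match.group(1)
        return str(fields.get(key, match.group(0)))
    return re.sub(r"%([A-Za-z])", substitute, fmt)


# Sample values only; a real daemon fills these in per transferred file.
example = render_log_line(DEFAULT_FORMAT, {
    "o": "send", "h": "client.example.org", "a": "192.168.0.7",
    "m": "www", "u": "backup", "f": "index.html", "l": 1024,
})
# example == "send client.example.org [192.168.0.7] www (backup) index.html 1024"
```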
dont compress: Used to specify files that are not compressed before transmission. The default value is *.gz *.tgz *.zip *.z *.rpm *.deb *.iso *.bz2 *.tbz
3.2 Rsync Command
After the rsync server configuration is completed, the next step is to issue the rsync command on the client to back up the files on the server to the client. rsync is a very powerful tool, and its command also has many functional options. We will analyze and explain its options one by one below.
The rsync command takes one of the following six forms:
- rsync [OPTION]... SRC DEST
- rsync [OPTION]... SRC [USER@]HOST:DEST
- rsync [OPTION]... [USER@]HOST:SRC DEST
- rsync [OPTION]... [USER@]HOST::SRC DEST
- rsync [OPTION]... SRC [USER@]HOST::DEST
- rsync [OPTION]... rsync://[USER@]HOST[:PORT]/SRC [DEST]
Corresponding to the above six command formats, rsync has six different working modes:
- Copy local files. This mode is used when neither the SRC nor the DEST path contains a single colon ":" separator. For example: rsync -a /data /backup
- Use a remote shell program (such as rsh or ssh) to copy the contents of the local machine to the remote machine. This mode is used when the DEST path contains a single colon ":" separator. For example: rsync -avz *.c foo:src
- Use a remote shell program (such as rsh or ssh) to copy the contents of the remote machine to the local machine. This mode is used when the SRC path contains a single colon ":" separator. For example: rsync -avz foo:src/bar /data
- Copy files from a remote rsync server to the local machine. This mode is used when the SRC path contains the "::" separator. For example: rsync -av root@172.16.78.192::www /databack
- Copy files from the local machine to the remote rsync server. This mode is used when the DEST path contains the "::" separator. For example: rsync -av /databack root@172.16.78.192::www
- List the files on the remote machine. This is similar to rsync transmission, but just omit the local machine information in the command. For example: rsync -v rsync://172.16.78.192/www
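The rules above for choosing a working mode can be summarized in a short Python sketch. The function name and the returned labels are hypothetical; the sketch assumes that a colon only counts as a host separator when it appears before the first slash, so that a local path such as ./odd:name is still treated as local.

```python
def transfer_mode(src: str, dest: str = "") -> str:
    """Classify an rsync invocation by the separators in SRC and DEST."""
    def kind(path: str) -> str:
        if path.startswith("rsync://"):
            return "daemon"
        head = path.split("/", 1)[0]  # the part before the first slash
        if "::" in head:
            return "daemon"
        if ":" in head:
            return "shell"
        return "local"

    if not dest:
        return "list only"  # no destination: just list the source files
    src_kind, dest_kind = kind(src), kind(dest)
    if src_kind == "daemon":
        return "pull from daemon"
    if dest_kind == "daemon":
        return "push to daemon"
    if src_kind == "shell":
        return "pull via remote shell"
    if dest_kind == "shell":
        return "push via remote shell"
    return "local copy"
```

For instance, transfer_mode("foo:src/bar", "/data") returns "pull via remote shell", while transfer_mode("/databack", "root@172.16.78.192::www") returns "push to daemon".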
The specific explanation of rsync parameters is as follows:
- -v, --verbose Verbose mode output
- -q, --quiet Quiet output mode
- -c, --checksum Turn on the checksum switch to force a checksum on the file transfer
- -a, --archive Archive mode, which means transferring files recursively and keeping all file attributes, which is equal to -rlptgoD
- -r, --recursive Process subdirectories recursively
- -R, --relative Use relative path information
- -b, --backup Create backups: if a file with the same name already exists at the destination, the old file is renamed to filename~. You can use the --suffix option to specify a different backup file suffix.
- --backup-dir=DIR Store the backup files (such as filename~) in the directory DIR
- --suffix=SUFFIX Define the backup file suffix
- -u, --update Update only: skip files that already exist in DEST and whose modification time is later than that of the source file (do not overwrite newer files)
- -l, --links keep soft links
- -L, --copy-links Treat soft links like regular files
- --copy-unsafe-links Copy the files referenced by links that point outside the SRC directory tree, rather than the links themselves
- --safe-links Ignore links pointing outside the SRC path directory tree
- -H, --hard-links keep hard links
- -p, --perms preserve file permissions
- -o, --owner keep file owner information
- -g, --group keep file group information
- -D, --devices Keep device file information (in current versions, -D is shorthand for --devices --specials)
- -t, --times keep file time information
- -S, --sparse Treat sparse files specially to save space in DST
- -n, --dry-run Show which files would be transferred, without actually transferring anything
- -W, --whole-file Copy files whole, without incremental (delta) detection
- -x, --one-file-system Do not cross file system boundaries
- -B, --block-size=SIZE The block size used by the checksum algorithm; older versions default to 700 bytes, while current versions normally choose it based on the size of each file
- -e, --rsh=COMMAND specifies to use rsh or ssh for data synchronization
- --rsync-path=PATH specifies the path information of the rsync command on the remote server
- -C, --cvs-exclude Use the same method as CVS to automatically ignore files, used to exclude files that you do not want to transfer
- --existing only updates files that already exist in DST, and does not back up newly created files
- --delete Delete files in DST that are not in SRC
- --delete-excluded Also delete files on the receiving end that are excluded from the transfer
- --delete-after delete after the transfer is completed
- --ignore-errors Delete even if IO errors occur
- --max-delete=NUM Delete at most NUM files
- --partial retains files that were not fully transferred for some reason, thus speeding up subsequent retransmissions
- --force Force deletion of a directory even if it is not empty
- --numeric-ids Do not map user and group IDs to user and group names; transfer the numeric IDs as they are
- --timeout=TIME I/O timeout, in seconds
- -I, --ignore-times Do not skip files with the same time and length
- --size-only When deciding whether to back up a file, only look at the file size without considering the file time
- --modify-window=NUM The timestamp window used to determine whether the files have the same time, the default is 0
- -T, --temp-dir=DIR Create temporary files in DIR
- --compare-dest=DIR Also compare the files in DIR to decide whether they need to be backed up
- -P Equivalent to --partial --progress
- --progress Display backup progress
- -z, --compress compresses the backup files during transfer
- --exclude=PATTERN specifies the pattern of files that do not need to be transferred
- --include=PATTERN specifies the file pattern that should not be excluded and should be transferred
- --exclude-from=FILE exclude files with the specified pattern in FILE
- --include-from=FILE Do not exclude files matching the pattern specified by FILE
- --version print version information
- --address Bind to a specific address
- --config=FILE specifies another configuration file instead of the default rsyncd.conf file
- --port=PORT specifies another rsync service port
- --blocking-io Use blocking IO for remote shell
- --stats Give some file transfer statistics
- --log-format=FORMAT Specify the log file format
- --password-file=FILE get password from FILE
- --bwlimit=KBPS Limit I/O bandwidth, KBytes per second
- -h, --help Display help information
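When rsync is driven from a script, it can help to assemble the options as an argument list instead of a single string. The sketch below, with a hypothetical helper name and example paths, combines a few of the options described above and defaults to --dry-run so that nothing is changed until the result has been reviewed.

```python
import shlex


def build_rsync_command(src, dest, *, delete=False, dry_run=True,
                        excludes=(), bwlimit_kbps=None):
    """Return an argument list suitable for subprocess.run()."""
    args = ["rsync", "-avz"]
    if dry_run:
        args.append("--dry-run")      # only show what would be transferred
    if delete:
        args.append("--delete")       # remove files in DEST missing from SRC
    for pattern in excludes:
        args.append(f"--exclude={pattern}")
    if bwlimit_kbps is not None:
        args.append(f"--bwlimit={bwlimit_kbps}")
    args += [src, dest]
    return args


# Example values only: the host, module, and paths are placeholders.
command = build_rsync_command("/data/", "backup@192.168.0.10::www",
                              delete=True, excludes=["*.tmp"],
                              bwlimit_kbps=500)
printable = shlex.join(command)  # a shell-quoted string for logging
```

Passing the list to subprocess.run(command) avoids shell quoting problems with patterns such as *.tmp; calling the helper with dry_run=False performs the real transfer.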
Configuring rsync on Linux
Rsync 3.2.4 released
April 15, 2022
Rsync version 3.2.4 has been released. It is another typical release with bug fixes and some enhancements. It also includes a security fix for the bundled zlib 1.2.8, which may or may not be used in your particular build configuration.
Notice: There is now a patch that fixes the configure check for "signed char" when "-pedantic-errors" is in effect. This affects systems where "char" defaults to "unsigned char" (e.g. ARM systems) and returns the rsync algorithm to full efficiency (without the patch, a transfer sends more literal data instead of finding matching local data).
See also the 3.2.4 NEWS for the detailed changelog. The latest manpages are also available online.
The source tarball is rsync-3.2.4.tar.gz, the tarball of the "patches" directory is distributed as a separate file, rsync-patches-3.2.4.tar.gz, and the diffs against version 3.2.3 are in rsync-3.2.3-3.2.4.diffs.gz. Each file has an accompanying signature.
rsync server configuration
- Install rsync (steps omitted)
- Modify the configuration file: vim /etc/rsyncd.conf
uid=root
gid=root
max connections=4
use chroot=no
log file=/var/log/rsyncd.log
pid file=/var/run/rsyncd.pid
lock file=/var/run/rsyncd.lock
secrets file=/etc/rsyncd.pwd
hosts allow = 10.10.0.200
hosts deny = 0.0.0.0/0
[oa]
path=/home/sxit/appbak
comment = backup file
ignore errors = yes
read only = no
list = yes
auth users = root
- Create the password file and set its permissions
vim /etc/rsyncd.pwd
root:password
chmod 600 /etc/rsyncd.pwd
- Start the daemon (may require root privileges): /usr/bin/rsync --daemon --config=/etc/rsyncd.conf
(Check the /etc/init.d/rsyncd and /etc/xinetd.d/rsync files to determine whether the service is started by the xinetd super daemon or as a standalone process, then use "service xinetd start" or "service rsyncd start" accordingly.)
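To check how a configuration like the one above will be read, the following Python sketch splits rsyncd.conf-style text into global parameters and modules, and resolves a parameter for a module by falling back to the global value. It is a simplified reader written for illustration, not rsync's own parser; the function names are hypothetical, and skipping lines without "=" is an assumption of this sketch.

```python
def parse_rsyncd_conf(text: str):
    """Return (global_params, modules) from rsyncd.conf-style text."""
    global_params, modules = {}, {}
    current = global_params
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = modules.setdefault(line[1:-1].strip(), {})
            continue
        if "=" not in line:
            # Assumption: a line without "=" carries no value, so skip it.
            continue
        name, value = line.split("=", 1)
        current[name.strip().lower()] = value.strip()
    return global_params, modules


def effective(module, name, global_params, modules, default=None):
    """Module value if set, else the global value, else the default."""
    return modules.get(module, {}).get(name, global_params.get(name, default))


# A shortened version of the server configuration shown above.
SAMPLE = """\
max connections = 4
use chroot = no
[oa]
path = /home/sxit/appbak
read only = no
"""
global_params, modules = parse_rsyncd_conf(SAMPLE)
```

Here effective("oa", "use chroot", global_params, modules) returns "no", because the module does not set the parameter and the global value applies as its default.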
Rsync client (10.10.0.200) configuration
- Create a file to store the password
vim /etc/rsyncd.pwd
password
chmod 600 /etc/rsyncd.pwd
- Synchronize the server data: /usr/bin/rsync -vazp --progress --password-file=/etc/rsyncd.pwd root@10.10.1.3::oa /home1/sxit/appbak (local storage directory)
- You can add this command to cron so that it runs on a schedule.
vim /etc/crontab
01 04 * * * root /usr/bin/rsync -vzau --progress --password-file=/etc/rsyncd.pwd root@10.10.1.3::oa /home1/sxit/appbak
Configuring rsync on Windows
Rsync Server Configuration
- Install rsync
rsync official site:
https://rsync.samba.org/
[ WINDOWS RSYNC Server ]
https://itefix.net/dl/free-software/cwrsync_6.2.4_x64_free.zip
https://itefix.net/dl/free-software/cwrsync_5.5.0_x86_free.zip
Choose the installation path E:\cwRsyncServer. Note: on Windows, rsync is packaged as a wrapper around Cygwin.
During installation, the installer does the following:
a. Create a new user SvcwRsync and make this user an administrator. A password will be generated.
b. Set the permissions of the installation directory E:\cwRsyncServer. Please do not change the permissions yourself.
c. Create service RsyncServer
In older versions, you may need to perform the steps above by hand and create the service manually as shown below; here the installer has already done all three steps.
cygrunsrv.exe -I "Rsync" -p /cygdrive/d/cwRsyncServer/bin/rsync.exe -a "--config=/cygdrive/d/cwRsyncServer/etc/rsyncd.conf --daemon --no-detach" -f "Rsync" -u Administrator -w 123456
- Modify the configuration file rsyncd.conf
use chroot = false
strict modes = false
hosts allow = 10.10.1.3
log file = rsyncd.log
pid file = rsyncd.pid
secrets file = /cygdrive/e/cwRsyncServer/etc/rsyncd.txt
# Module definitions
# Remember cygwin naming conventions: c:\work becomes /cygdrive/c/work
#
[mail]
path = /cygdrive/d/MDaemon/Users
read only = false
ignore errors = yes
auth users = root
transfer logging = yes
- Create an account and assign permissions
In the configuration file I use the root account, so I need to create a new root account, set its password, add it to the administrators group, and give it full control permissions on D:\MDaemon\Users
- Create the txt file (the rsyncd.txt secrets file named in the configuration)
root:password
- Start the service
Start the RsyncServer service in the Windows Services console and set it to start automatically. If everything is normal, you can see that port 873 is listening
Rsync Client Configuration
- Create a new password storage file
vim /etc/rsync.pwd
password
chmod 600 /etc/rsync.pwd
- Synchronize the server data: /usr/bin/rsync -auv --progress --password-file=/etc/rsync.pwd root@10.10.3.1::mail /home/maiusers (without the -z parameter; it can connect but cannot transfer files)