kortex- n00b
Joined: 24 Apr 2014 Posts: 14
|
Posted: Fri Nov 07, 2014 10:25 am Post subject: Rsync : synchronizes the entire file |
|
|
Hello,
I have a problem with rsync...
I have an AutoFS mountpoint (nfs protocol) which is created in /mnt/backup.
I also have a local directory "/home/backup".
I run rsync every day to sync the contents of "/mnt/backup" into "/home/backup".
Command is : /usr/bin/sudo -u nobody /usr/bin/rsync -av --stats /mnt/backup/* /home/backup/.
I have a lot of large binary files (1TB).
The problem is that rsync transfers the entire content every time, not just the differences.
I tried several settings found on the internet, but I get the same result every time.
I read that I should use "--no-whole-file" and/or "--inplace", but with these parameters rsync takes a long time to calculate the checksums for large files...
Does anyone have a solution to transfer only the binary differences quickly?
Thank you in advance. |
|
|
|
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9677 Location: almost Mile High in the USA
|
Posted: Fri Nov 07, 2014 1:59 pm Post subject: |
|
|
Don't rsync over nfs.
The problem is that when you run rsync over NFS, it has to pull the file contents over the network just to tell what the differences are. Ideally you would have something like BitTorrent, which hashes blocks and transfers only the individual blocks that changed...
... which is something rsync can also do, but only when using an rsync server or over ssh/rsh. What you want is a fast checksum computation that moves as little data as possible over the network.
_________________ Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching? |
|
|
|
kortex- n00b
Joined: 24 Apr 2014 Posts: 14
|
Posted: Fri Nov 07, 2014 4:30 pm Post subject: |
|
|
Thank you for your answer.
But there are two things I do not understand :
- why doesn't rsync sync the entire file every time for smaller files?
- what's the difference between NFS and SSH ? In both cases it's on network |
|
|
|
Atom2 Apprentice
Joined: 01 Aug 2011 Posts: 185
|
Posted: Fri Nov 07, 2014 11:06 pm Post subject: |
|
|
kortex- wrote: | - what's the difference between NFS and SSH ? In both cases it's on network |
You are right that both go over a network, but the difference is that for an NFS-mounted filesystem, all data from the NFS server needs to be transferred to the local rsync process running on the NFS client.
In contrast, the ssh solution starts a remote rsync process on the NFS server itself, which can then read data directly from its attached disks rather than through a network connection. In this case only the two rsync processes (the one on the client and the one on the server) communicate with each other, and there is no need to transfer the complete file.
The ssh solution is conceptually identical to having an on-demand rsync server on the NFS server (i.e. one that does not constantly listen for incoming connections but is only started on demand through ssh's remote command execution on the target system).
I hope that helps. Atom2 |
|
|
|
kortex- n00b
Joined: 24 Apr 2014 Posts: 14
|
Posted: Wed Nov 12, 2014 7:58 am Post subject: |
|
|
Hello,
I tried rsync over SSH but I have the same problem: the whole file is transferred.
I tried two commands:
/usr/bin/sudo -u nobody /usr/bin/rsync -av --stats -e "ssh -o StrictHostKeyChecking=no -i /rsync/id_rsa" user@source:/home/backup/* /home/backup/.
/usr/bin/sudo -u nobody /usr/bin/rsync -av --stats --no-whole-file -e "ssh -o StrictHostKeyChecking=no -i /rsync/id_rsa" user@source:/home/backup/* /home/backup/.
I will try this command tonight :
/usr/bin/sudo -u nobody /usr/bin/rsync -av --stats --no-whole-file --inplace --checksum -e "ssh -o StrictHostKeyChecking=no -i /rsync/id_rsa" user@source:/home/backup/* /home/backup/. |
|
|
|
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9677 Location: almost Mile High in the USA
|
Posted: Wed Nov 12, 2014 3:49 pm Post subject: |
|
|
I think it's block based and not a true diff. If you delete one byte from the beginning of the file, the checksum of all blocks will now fail and the whole file gets transferred.
I'm not sure what the nature of your changes is... |
|
|
|
kortex- n00b
Joined: 24 Apr 2014 Posts: 14
|
Posted: Wed Nov 12, 2014 4:47 pm Post subject: |
|
|
The files are MongoDB database files (binary).
We do no deletes and no updates; just inserts. |
|
|
|
kortex- n00b
Joined: 24 Apr 2014 Posts: 14
|
Posted: Thu Nov 13, 2014 1:14 pm Post subject: |
|
|
Hello,
The problem is the same with these arguments: the whole file is transferred.
If anyone has an idea, I'm interested. |
|
|
|
kortex- n00b
Joined: 24 Apr 2014 Posts: 14
|
Posted: Wed Nov 19, 2014 4:28 pm Post subject: |
|
|
Does nobody have a solution?
Thank you in advance. |
|
|
|
WWWW Tux's lil' helper
Joined: 30 Nov 2014 Posts: 143
|
Posted: Sun Nov 30, 2014 6:56 pm Post subject: |
|
|
A few things I don't understand.
Do you have large binaries of 1TB each, or large binaries that add up to 1TB in total?
I think rsync falls short for incremental syncs. Some filesystems get around this problem though. |
|
|
|
szatox Advocate
Joined: 27 Aug 2013 Posts: 3131
|
Posted: Sun Nov 30, 2014 8:05 pm Post subject: |
|
|
Quote: |
The files are MongoDB database files (binary).
We do no deletes and no updates; just inserts. |
Do those inserts move the data that follows?
I mean, if you prepend some kind of header to your file (inserting it at position "0"), all the data inside is pushed forward by the size of that header, right?
So perhaps rsync no longer recognizes the data because it has moved inside the file and no longer lines up with the block boundaries?
Does the same issue occur when you append data to the file?
Quote: | something that rsync can also do, but only when using a rsync server or over ssh/rsh. |
Actually, you can also use rsync over its own protocol. This requires setting up an rsync daemon, though, so that the process you start has something to connect to.
One more thing: if you want to call something a backup, you usually want to keep several versions. You can run rsync from location A against a reference location B (e.g. your last backup) to let it create a new target C. Rsync should then copy whatever is new in A to C and create hard links from C to B for files that haven't changed since the last run.
If you don't want to keep several versions, perhaps you just want to have a mirror? |
|
|
|
EmaRsk Apprentice
Joined: 07 Sep 2004 Posts: 158 Location: Italy
|
Posted: Tue Dec 02, 2014 3:43 pm Post subject: |
|
|
I don't know if this can be useful, but rsync's man page says:
man rsync wrote: | -B, --block-size=SIZE force a fixed checksum block-size |
|
|
|
|
|