KpR2000 n00b
Joined: 18 Aug 2003 Posts: 55
|
Posted: Thu Jun 10, 2004 10:09 am Post subject: |
|
|
flybynite wrote: |
The easiest fix for the name lookup failures in your logs is to list the IPs and hostnames in /etc/hosts.
|
The IPs for the rsync server and all other PCs on my network are listed there.
I also did not use a hostname in /etc/make.conf for the SYNC variable:
SYNC="rsync://192.168.3.84/gentoo-portage"
Quote: |
I noticed that you seem to be comparing your rsync server speed with someone's distfile cache speed in this thread. Two different things.
Now that you have your config file fixed, what speeds are you getting?
|
That's my current state:
Code: |
receiving file list ...
1 file to consider
timestamp.chk
32 100% 31.25kB/s 0:00:00 (1, 100.0% of 1)
Number of files: 1
Number of files transferred: 1
Total file size: 32 bytes
Total transferred file size: 32 bytes
Literal data: 32 bytes
Matched data: 0 bytes
File list size: 32
Total bytes written: 226
Total bytes read: 437
wrote 226 bytes read 437 bytes 442.00 bytes/sec
total size is 32 speedup is 0.05
|
And the collected packages counter is really slow compared to the "internet sync", as in the previous answer by "CarpJA".
Greetings |
|
|
|
dhurt Apprentice
Joined: 14 May 2003 Posts: 278 Location: Davis, CA
|
Posted: Thu Jun 10, 2004 4:04 pm Post subject: |
|
|
KpR2000 wrote: |
Code: |
receiving file list ...
1 file to consider
timestamp.chk
32 100% 31.25kB/s 0:00:00 (1, 100.0% of 1)
Number of files: 1
Number of files transferred: 1
Total file size: 32 bytes
Total transferred file size: 32 bytes
Literal data: 32 bytes
Matched data: 0 bytes
File list size: 32
Total bytes written: 226
Total bytes read: 437
wrote 226 bytes read 437 bytes 442.00 bytes/sec
total size is 32 speedup is 0.05
|
|
This is way too little data to measure the speed from. The timestamp file contains just this:
Code: |
Thu Jun 10 15:06:57 UTC 2004
|
Downloading something this small will take the same time from the LAN or from the internet, even over a 56K modem. Bandwidth is not the bottleneck. The problem with trying to see a speed increase on the emerge sync is that you will not see one. You are transferring lots of very small files. This is NOT a bandwidth-intensive process. Look at my results from the main portage tree sync. As you can see, the internet sync is faster. Maybe by about 5 seconds out of 40. Nothing big. Also, the amount of file data transferred is roughly 340 KB (341443 bytes), which is tiny.
Code: |
<Internet>
------------------------------------------------------------
Number of files: 85256
Number of files transferred: 182
Total file size: 69887332 bytes
Total transferred file size: 341443 bytes
Literal data: 341443 bytes
Matched data: 0 bytes
File list size: 1944462
Total bytes written: 3825
Total bytes read: 2083098
wrote 3825 bytes read 2083098 bytes 46897.15 bytes/sec
total size is 69887332 speedup is 33.49
<Local Mirror>
------------------------------------------------------------
Number of files: 86006
Number of files transferred: 180
Total file size: 69896735 bytes
Total transferred file size: 340442 bytes
Literal data: 340442 bytes
Matched data: 0 bytes
File list size: 2036955
Total bytes written: 3785
Total bytes read: 2177599
wrote 3785 bytes read 2177599 bytes 37937.11 bytes/sec
total size is 69896735 speedup is 32.04
|
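To put numbers on this, here is a small Python sketch (an illustration only, not part of rsync or portage) that recomputes the figures from the summaries above. It assumes that the reported speedup is the total file size divided by the bytes sent plus received, and that the file list counts toward the bytes on the wire; the function name and field names are made up for this example.

```python
def rsync_summary(total_size, file_list_size, bytes_written, bytes_read):
    """Derive two ratios from the numbers in an rsync stats summary.

    Assumption: speedup = total size / (bytes written + bytes read),
    which reproduces the values rsync printed in the runs above.
    """
    wire_bytes = bytes_written + bytes_read
    speedup = total_size / wire_bytes
    # Share of the traffic that is file list rather than file contents.
    file_list_share = file_list_size / wire_bytes
    return speedup, file_list_share


# Numbers copied from the three runs quoted in this thread.
internet = rsync_summary(69887332, 1944462, 3825, 2083098)
local = rsync_summary(69896735, 2036955, 3785, 2177599)
timestamp = rsync_summary(32, 32, 226, 437)

for label, (speedup, share) in (("internet", internet),
                                ("local", local),
                                ("timestamp.chk", timestamp)):
    print("%-14s speedup %.2f, file list share %.0f%%"
          % (label, speedup, share * 100))
```

Under those assumptions, the file list makes up more than nine tenths of the traffic in both full syncs, which fits the point that the sync is dominated by walking many small files rather than by moving data.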
I found this process is HIGHLY dependent on server load. I tried syncing while the main server was caching the portage files and the process was as slow as can be. I know that rsyncing causes major load on the server: not just disk activity, but also computational activity. So the bottleneck can be somewhere else, probably the speed of the computer. The Gentoo servers are usually dual-processor machines optimized to be rsync mirrors; your desktop/local server probably is not, so the process will be slower. But it will not take unbearably longer: the actual rsync time is very short, and caching the portage tree takes much longer than the rsync process anyway.
The reason for creating a local mirror is not speed (as you can see, mine was slower) but to reduce load on the Gentoo servers. You really do not need to download two copies of the same information. That is wasteful. If you want a speed increase, take a look here:
https://forums.gentoo.org/viewtopic.php?t=173226&highlight=
This is http-replicator, which caches all the files portage downloads and then serves them up at LAN speeds. _________________ "And isn't sanity really just a one-trick pony, anyway? I mean, all you get is one trick, rational thinking, but when you're good and crazy, ooh ooh ooh, the sky's the limit!" -- The Tick |
|
|
|
KpR2000 n00b
Joined: 18 Aug 2003 Posts: 55
|
Posted: Thu Jun 10, 2004 4:29 pm Post subject: |
|
|
You are right with your arguments. I will give http-replicator a try.
Thx |
|
|
|
mxc Guru
Joined: 05 Mar 2003 Posts: 442 Location: South Africa
|
Posted: Sat Jun 12, 2004 6:31 am Post subject: |
|
|
Is it possible to set the client up to fall back to an external rsync server if it cannot find the file it needs on the local server? I have an ADSL connection with a cap limit. I often need to install machines overnight, and in this case I would rather have the machine finish compiling than save bandwidth.
thanks |
|
|
|
dhurt Apprentice
Joined: 14 May 2003 Posts: 278 Location: Davis, CA
|
Posted: Sat Jun 12, 2004 6:42 am Post subject: |
|
|
You are confusing an rsync mirror and a package mirror. The rsync mirror, which this thread is about, will not have any packages in it. It allows you to run emerge sync against the Gentoo mirrors on just one machine and then replicate that effort to other machines on the LAN, to reduce the load on the Gentoo mirrors. It just synchronizes /usr/portage, but excludes /usr/portage/distfiles and /usr/portage/packages. So there are no files that are skipped unless you have a funky setup.
I think you are referring to setting up a local package mirror, for which you would want to use http-replicator, another part of the system. It basically caches locally all the files that you have downloaded for building purposes. If it cannot find a file, it downloads it from the internet.
A link to it is about 3 posts above. |
|
|
|
mxc Guru
Joined: 05 Mar 2003 Posts: 442 Location: South Africa
|
Posted: Sun Jun 13, 2004 7:02 am Post subject: |
|
|
Thanks KillBill,
In one post I found, the poster had set up an rsync 'link' to the portage/distfiles directory. Would just syncing this with another machine not mean that I have all the files the other has, so that I only need to download the ones I don't already have?
Would rsyncing the distfiles dir skip some important step that emerge needs?
I will look into setting up the http proxy as a longer term solution later.
thanks |
|
|
|
dhurt Apprentice
Joined: 14 May 2003 Posts: 278 Location: Davis, CA
|
Posted: Sun Jun 13, 2004 4:11 pm Post subject: |
|
|
The problem with the rsync solution is that there is no fallback if the package is not on the main server. It is a poor solution because if you are upgrading a lot of packages and one file cannot be downloaded halfway through, the ebuild will fail. You then have to change your mirror and download the file manually, or go to your server, download the file manually, and change your mirror back so it points at your local server. Only then can you continue the ebuild.
I know; I used it for about 2 months. It was a pain to keep up in the long run and not transparent at all.
http-replicator, on the other hand, is very seamless. It is a proxy between you and the internet just for the purpose of getting distfiles. It does not mess with your traffic in any other way. How it works: all requests for files now go through the proxy. If it has the file locally, it serves it up at LAN speeds. If it does not have the file, it fetches the file to the proxy and sends it to the requesting machine at the same time. So if the file is on your proxy, it comes in at LAN speeds; if not, it comes in at the speed of your connection. I have been using it for 2-3 weeks now and it is excellent. |
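The cache-or-fetch behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not http-replicator's actual code: `get_distfile`, `fetch_upstream` and the file name are hypothetical, and unlike the real proxy this sketch does not stream the file to the client while it is still downloading.

```python
import os
import tempfile


def get_distfile(name, cache_dir, fetch_upstream):
    """Return (data, source): serve from the local cache if possible,
    otherwise fetch from upstream and keep a copy for the next client."""
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read(), "cache"      # LAN-speed case
    data = fetch_upstream(name)           # internet-speed case
    with open(path, "wb") as f:
        f.write(data)
    return data, "upstream"


# Stand-in for a real download, so the example runs without a network.
calls = []

def fake_fetch(name):
    calls.append(name)
    return b"tarball bytes"


with tempfile.TemporaryDirectory() as cache:
    first = get_distfile("foo-1.0.tar.gz", cache, fake_fetch)
    second = get_distfile("foo-1.0.tar.gz", cache, fake_fetch)

print(first[1], second[1], len(calls))
```

The first request goes upstream and the second is answered from the cache, so the upstream fetch happens only once no matter how many machines ask for the same file.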
|
|
|
seringen Apprentice
Joined: 03 Aug 2003 Posts: 163 Location: berkeley, california
|
Posted: Tue Jun 15, 2004 6:29 am Post subject: just adding my data |
|
|
Well, other than a stupid carriage return error in a config file, everything worked immediately and beautifully.
To give people an idea of typical performance, here's an example from my network:
First the rsync server over ssh, a VIA Nehemiah computer:
Code: |
# hdparm -tT /dev/hda
/dev/hda:
Timing buffer-cache reads: 520 MB in 2.01 seconds = 258.10 MB/sec
Timing buffered disk reads: 122 MB in 3.06 seconds = 39.89 MB/sec
|
Now the connecting computer, a PIII laptop with a slow, ordinary hard drive:
Code: |
# hdparm -tT /dev/hda
/dev/hda:
Timing buffer-cache reads: 416 MB in 2.00 seconds = 207.51 MB/sec
Timing buffered disk reads: 54 MB in 3.12 seconds = 17.33 MB/sec
|
Over fast ethernet it gets
All in all not bad and without any optimizations of any sort, and it really is a good thing to take some of the weight off of the main mirrors - it's easy to forget how heavy rsync is on servers. |
|
|
|
Cetanu n00b
Joined: 16 Jun 2004 Posts: 1
|
Posted: Wed Jun 16, 2004 7:53 pm Post subject: Local mirror outside /usr/portage |
|
|
Is there any reason to keep the portage tree for an unofficial mirror outside the server's /usr/portage directory?
I am asking because I installed the app-admin/gentoo-rsync-mirror package today and it keeps portage in a separate directory by default (/opt/gentoo-rsync/portage/). I have used a configuration with portage kept in /usr/portage for half a year and haven't experienced any problems yet... |
|
|
|
dhurt Apprentice
Joined: 14 May 2003 Posts: 278 Location: Davis, CA
|
Posted: Wed Jun 16, 2004 9:57 pm Post subject: |
|
|
It works great with the directory /usr/portage/. Maybe on the server configuration they like to mount the /usr directory read-only until update times, and storing the tree in /opt would allow them to do this and still have an up-to-date mirror. |
|
|
|
flybynite l33t
Joined: 06 Dec 2002 Posts: 620
|
Posted: Thu Jun 17, 2004 9:03 am Post subject: |
|
|
New HOWTO version 1.2!
I added a note about the rsync daemon nice level that my /etc/init.d/rsyncd script sets on starting. This applies only if you use my script on your machine.
My script sets the nice level to a lower priority (15) than normal (0) because I spend time logged in on my rsync server box and use it as a normal desktop. If you do too, leave it as is. If you only use your rsync server as a server, go ahead and set the nice level to 0 so that rsync runs at normal priority and normal speed. |
|
|
|
JSharku Apprentice
Joined: 09 Feb 2003 Posts: 189 Location: Belgium
|
Posted: Thu Jun 17, 2004 8:53 pm Post subject: |
|
|
Just a quick note on packages and distfiles; it's better to put the following in your rsyncd.conf:
Code: |
# excluding packages is optional, if you don't use --buildpkg you don't need it
exclude = distfiles/ packages/
|
instead of
Code: |
exclude = distfiles packages
|
NOTE THE TRAILING SLASHES
If you don't add the slashes, rsync will exclude anything named distfiles or packages, files as well as directories, not just those two directories. Not that big a deal, you might say, were it not that every /usr/portage/profiles/<specific profile>/ directory contains a file called packages, which portage uses to determine what to build when you bootstrap or emerge system. Without the trailing slashes, those files get deleted by rsync on the client machines, resulting in rebuilds, rebootstraps, resyncs and tons of frustration... at least it did for me until I finally figured this out.
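To make the difference concrete, here is a deliberately simplified Python model of how a single exclude pattern is matched. It is only a sketch, not rsync's implementation: it handles plain names with an optional trailing slash, ignores wildcards and anchored patterns, and the profile path is a made-up example.

```python
def is_excluded(path, is_dir, pattern):
    """Simplified model of one rsync exclude pattern.

    Assumptions: the pattern is a plain name without wildcards, it is
    compared against the last path component only, and a trailing slash
    restricts the match to directories.
    """
    dir_only = pattern.endswith("/")
    name = pattern.rstrip("/")
    if dir_only and not is_dir:
        return False
    return path.rstrip("/").split("/")[-1] == name


profile_file = "profiles/some-profile/packages"   # a regular file

# Without the slash the profile's packages file is excluded too.
print(is_excluded(profile_file, False, "packages"))
# With the slash only directories named packages are excluded.
print(is_excluded(profile_file, False, "packages/"))
print(is_excluded("packages", True, "packages/"))
```

In this model the pattern without the slash catches the profile file, while the pattern with the slash leaves it alone and still excludes the top-level packages directory.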
Sharku |
|
|
|
flybynite l33t
Joined: 06 Dec 2002 Posts: 620
|
Posted: Fri Jun 18, 2004 5:41 am Post subject: |
|
|
JSharku wrote: | Just a quick note on packages and distfiles; it's better to put the following in your rsyncd.conf:
Code: |
# excluding packages is optional, if you don't use --buildpkg you don't need it
exclude = distfiles/ packages/
|
|
I see your point about what a trailing / does and you are correct, but I'd bet you're doing this for the wrong reasons and you don't need it either!!
First, you're correct that the trailing slash in the exclude pattern ensures it only excludes directories and not files. I've updated my howto just to make it clear what is being excluded, but it probably doesn't matter whether any user changes their config.
The reason it doesn't matter is that we're dealing with the SERVER. The exclude in the SERVER config makes it impossible for a client to TRY to get distfiles (or packages, in your config) by rsync.
But portage will NOT request those files!!!!!
Look at file:/usr/lib/portage/bin/emerge for rsync_flags and you'll find that the portage CLIENT sets rsync options that automatically skip distfiles, local, and packages.
Code: |
"--exclude='distfiles/*'", # Exclude distfiles from consideration
"--exclude='local/*'", # Exclude local from consideration
"--exclude='packages/*'", # Exclude packages from consideration
|
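For illustration, this is roughly how a client-side rsync command with those three excludes could be assembled. It is a hedged sketch, not portage's real code: `build_sync_command` is a made-up helper, `--recursive` and `--delete` are standard rsync options chosen for the example rather than portage's actual flag list, and the URI is the SYNC value quoted earlier in this thread. The command is only built here, never run.

```python
def build_sync_command(uri, dest="/usr/portage"):
    """Build an argument list for a client-side tree sync (illustration only)."""
    # The three patterns come from the rsync_flags excerpt above.
    excludes = ["distfiles/*", "local/*", "packages/*"]
    cmd = ["rsync", "--recursive", "--delete"]
    # No inner quotes are needed when arguments are passed as a list.
    cmd += ["--exclude=" + pattern for pattern in excludes]
    # Trailing slashes: copy the contents of the module into dest.
    cmd += [uri.rstrip("/") + "/", dest.rstrip("/") + "/"]
    return cmd


cmd = build_sync_command("rsync://192.168.3.84/gentoo-portage")
print(" ".join(cmd))
```

Because the excludes are part of the request itself, the client never asks for anything under distfiles, local or packages, whatever the server config says.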
So to wrap this up:
1. Gentoo's portage automatically skips distfiles, local, and packages when syncing, so you don't have to exclude these in the SERVER config, and they won't ever appear on clients when you 'emerge sync'.
2. Excluding distfiles in the SERVER config only serves to prevent anyone from abusing the server with their own rsync command. It is possible to create your own rsync request that would try to suck down all of distfiles from the public rsync servers. Excluding distfiles on the SERVER prevents a user from doing this. I left this in as protection for those running my local rsync server on a college campus, for example.
However, if you're running a semi-public server on a Gentoo box with a lot of packages and you're afraid someone might try to craft an rsync command to get all your packages, exclude distfiles/ packages/ per JSharku's example above. |
|
|
|
JSharku Apprentice
Joined: 09 Feb 2003 Posts: 189 Location: Belgium
|
Posted: Fri Jun 18, 2004 6:40 pm Post subject: |
|
|
When I first set up my local rsync server, an emerge sync would try to pull in the distfiles and packages, so I added that line to my rsyncd.conf, which worked at the time (portage 2.0.4x, 1-1.5 years ago). I didn't know the excludes had since been added to portage itself, so I kept the line thinking it was necessary. It's only very recently that I discovered it was messing with emerge system, but I still didn't know that the exclude line itself had become obsolete.
Sharku |
|
|
|
flybynite l33t
Joined: 06 Dec 2002 Posts: 620
|
Posted: Mon Jun 21, 2004 9:44 am Post subject: |
|
|
No problem, it's hard to keep up with all the changes. Thanks for helping make the syntax clear in the config. |
|
|
|
_sparks_ n00b
Joined: 12 Jan 2004 Posts: 1
|
Posted: Thu Jun 24, 2004 9:31 am Post subject: |
|
|
Try turning logging off in
/etc/rsync/rsyncd.conf:
Code: |
#This will log every file transferred - up to 85,000+ per user, per sync
transfer logging = no
|
This speeds things up in my configuration by a factor of 100 or so.
Last edited by _sparks_ on Fri Jun 25, 2004 12:05 pm; edited 1 time in total |
|
|
|
fvant Guru
Joined: 08 Jun 2003 Posts: 328 Location: Leiden, The Netherlands
|
Posted: Thu Jun 24, 2004 10:04 am Post subject: |
|
|
My local rsync server seems to sync in blocks of 200 files only. Whereas the internet download file counter can barely be followed, rsync from my local server steps slowly in steps of 200.
The CPU on the server I rsync from is not busy, the rsync process only uses 2.3%, and the HD uses DMA. |
|
|
|
Marwin n00b
Joined: 27 Oct 2002 Posts: 58
|
Posted: Thu Jun 24, 2004 11:01 am Post subject: |
|
|
Take your Samba server and make a directory called 'distfiles'.
Share it and have the clients mount it at /usr/portage/distfiles.
And voilà! You've got a shared distfiles directory. _________________ [ Never trust an operationsystem you don't have sources for ] |
|
|
|
quill18 n00b
Joined: 20 Jan 2004 Posts: 50
|
Posted: Thu Jun 24, 2004 7:38 pm Post subject: |
|
|
fvant wrote: | My local rsync server seems to sync in blocks of 200 files only. Whereas the internet download file counter can barely be followed, rsync from my local server steps slowly in steps of 200.
The CPU on the server I rsync from is not busy, the rsync process only uses 2.3%, and the HD uses DMA. |
Ditto on this. Very similar performance. Lots of spare CPU, bandwidth, and hard drive speed but terrible throughput.
I made sure that the hostnames are set up properly, and tried it with the original startup script as well as the one posted above. |
|
|
|
flybynite l33t
Joined: 06 Dec 2002 Posts: 620
|
Posted: Thu Jun 24, 2004 10:30 pm Post subject: |
|
|
It appears some users are using other rsyncd.conf files and startup scripts and are having problems. Someone even posted a bad rsyncd.conf in this thread!!
The reason I posted this HOWTO is to eliminate the junk floating around!!
Everyone, check that you are using the exact config and startup scripts in the HOWTO!! That will eliminate many problems!! |
|
|
|
Nekkrist n00b
Joined: 09 Oct 2003 Posts: 33
|
Posted: Mon Jun 28, 2004 3:32 am Post subject: |
|
|
For everyone having speed-related issues with syncing: this is probably not a network problem, configuration problem, or anything of the sort. It is probably simply an aspect of computer hardware.
The reason the rsync servers appear to be so fast is that all they do all day is offer syncing services. Your local mirror, however, does not do this all day; in fact it is probably very rarely synced against.
Since the inner workings of the rsync algorithm are somewhat detailed, if you are interested, read http://samba.org/~tridge/phd_thesis.pdf (the rsync author's PhD thesis, which includes a few chapters on rsync).
Otherwise, the basic result is that the rsync protocol operations are cached by the CPU cache of the main rsync mirrors, so that they don't actually need to be performed every single time. If you happened to be the very first person to sync against a main server after it was turned on, you would see results very similar to those from your own server. Your own server does not have these operations in the CPU cache, since your sync is likely the first one against it since its update.
If you have three or four computers, let one sync to the server, then after that one has completed, do another sync. Chances are it will be a good deal faster than your previous findings. |
|
|
|
flybynite l33t
Joined: 06 Dec 2002 Posts: 620
|
Posted: Thu Jul 01, 2004 8:20 am Post subject: |
|
|
Thanks for some more info, Nekkrist!
There are many things to consider about your rsync speed:
1. CPU/Memory - your old Pentium II 233 MHz isn't going to be as fast as an official rsync server such as crane.gentoo.org with its dual 1.7 GHz Xeons and 2 GB of RAM.
2. Filesystem/Disk Speed - Rsync has to consider about 85,000 small files in many directories. Put your /usr/portage on a fast disk with a filesystem that has high small-file performance.
3. Disk Cache - The second rsync will be faster than the first.
4. Logging - My config has logging turned off because every client rsync will generate 85,000+ lines in the log file!
5. More.... |
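A rough way to see points 2 and 3 on your own server is to time two passes over the tree, touching the metadata of every file each time. The Python sketch below is only an approximation of what building a file list involves, not what rsync actually executes; `stat_tree` and `timed_passes` are names invented for this example, and the demo runs on a tiny temporary tree so that it works anywhere.

```python
import os
import tempfile
import time


def stat_tree(root):
    """Walk root and stat every file, returning how many were seen."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count


def timed_passes(root, passes=2):
    """Return a list of (file_count, seconds) tuples, one per pass."""
    results = []
    for _ in range(passes):
        start = time.perf_counter()
        count = stat_tree(root)
        results.append((count, time.perf_counter() - start))
    return results


# Tiny stand-in tree: three directories with one placeholder file each.
with tempfile.TemporaryDirectory() as root:
    for i in range(3):
        sub = os.path.join(root, "cat%d" % i)
        os.mkdir(sub)
        with open(os.path.join(sub, "pkg.ebuild"), "w") as f:
            f.write("# placeholder\n")
    results = timed_passes(root)

for count, seconds in results:
    print("%d files in %.6f s" % (count, seconds))
```

To try it for real, call `timed_passes("/usr/portage")` on the mirror right after it has updated its tree. If the disk cache matters as described, the second pass should come out noticeably faster than the first, though the exact numbers depend on the machine.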
|
|
|
dmitrio Tux's lil' helper
Joined: 10 Dec 2002 Posts: 115 Location: Pago Pago
|
Posted: Thu Jul 01, 2004 12:29 pm Post subject: :. copied to gentoo-wiki.com |
|
|
I have copied this HOWTO, with permission of flybynite, to gentoo-wiki.com
http://gentoo-wiki.com/HOWTO_Local_Rsync_Mirror
If you see anything that should be added or changed, feel free to do so.
Thank you for a great HOWTO. _________________
... Leaving ground, destination is unknown,
into the darkness and far away from home,
Will your dream come true and what will you find,
when fate is your guide ... |
|
|
|
flybynite l33t
Joined: 06 Dec 2002 Posts: 620
|
Posted: Sun Jul 04, 2004 8:17 am Post subject: |
|
|
I appreciate that, dmitrio. The wiki should help get the word out!!
I also submitted the HOWTOs for possible inclusion in the Gentoo Weekly Newsletter, as suggested by monkeywrench on the http-replicator thread https://forums.gentoo.org/viewtopic.php?t=173226 |
|
|
|
dmitrio Tux's lil' helper
Joined: 10 Dec 2002 Posts: 115 Location: Pago Pago
|
Posted: Sun Jul 04, 2004 12:20 pm Post subject: :. copied to gentoo-wiki.com |
|
|
Thank you for a good HOWTO.
Please look at
http://gentoo-wiki.com/HOWTO_Download_Cache_for_LAN-Http-Replicator
If you see anything that should be added or changed, feel free to do so. |
|
|
|
|