Gentoo Forums
[LONG] Proposal for an alternative portage tree sync method

wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Tue Mar 22, 2005 4:40 am    Post subject: [LONG] Proposal for an alternative portage tree sync method Reply with quote

Warning: long 8O

Hi

Recently I've been having some problems doing the 'emerge sync' routine, which I think are related to my ISP connection.
The only alternative, of course, is the far from optimal 'emerge-webrsync'.

Both of these methods have problems.

Specifically, the rsync method has these problems:

  1. Doesn't work behind restrictive firewalls
  2. Due to the large number of files in the portage tree, building and transferring the file list takes a long time
  3. Involves a lot of disk thrashing, also due to the large number of files
  4. Puts a lot of stress on the server

The emerge-webrsync method is also not without problems:

  1. Transfers the whole (compressed) portage tree at once
  2. .. which is not a viable option for daily updates
  3. Involves a lot of disk thrashing anyway, due to the subsequent local rsync

Alternative methods could involve CVSup, which has already been considered. However, it has its own problems:

  1. Doesn't work behind restrictive firewalls
  2. It's written in Modula-3, which is not available on all architectures (AMD64 for instance, although it's getting more portable)
  3. Also involves disk thrashing (it's less of a bottleneck than rsync, however)
  4. Also puts (some?) stress on the server

So I would like to propose an idea for an alternative portage syncing method.

This method is based on zsync (ebuild available here).
zsync basically synchronizes a local file using the rsync algorithm, but through a normal HTTP server.
To do this, it only needs a pre-generated .zsync file on the server.

One of the most useful features of zsync, however, is that the file to be synchronized can be a common gzipped file on the server-side, and zsync only downloads the (compressed) ranges which are really necessary to synchronize the file.
But the main algorithm, of course, is still the rsync algorithm.
You can read more about this, including performance comparisons to rsync, on the technical paper.
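Since the block-matching idea is what makes all of this work, here's a rough, standalone illustration using only coreutils (this is NOT zsync itself; the 4K block size just mirrors the .zsync example below, and the file names are made up):

```shell
# Toy illustration of rsync/zsync block matching (NOT zsync itself):
# checksum fixed-size blocks of an old file, then count how many blocks
# of the new file have no match and would need to be downloaded.
rm -rf /tmp/blockdemo && mkdir -p /tmp/blockdemo && cd /tmp/blockdemo

# "old" and "new" versions of a 12K file; only the middle 4K block changes
{ printf 'A%.0s' $(seq 4096); printf 'B%.0s' $(seq 4096); printf 'C%.0s' $(seq 4096); } > old
{ printf 'A%.0s' $(seq 4096); printf 'X%.0s' $(seq 4096); printf 'C%.0s' $(seq 4096); } > new

split -b 4096 old oldblk.          # 4K blocks of the old file
split -b 4096 new newblk.          # 4K blocks of the new file
md5sum oldblk.* | awk '{print $1}' | sort > old.sums

changed=0
for b in newblk.*; do
    sum=$(md5sum "$b" | awk '{print $1}')
    grep -q "$sum" old.sums || changed=$((changed + 1))
done
echo "blocks to fetch: $changed of $(ls newblk.* | wc -l)"
```

The real algorithm is smarter than this sketch: it also computes a cheap rolling checksum, so matching blocks are found at any byte offset in the old file, not just on block boundaries.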

Because zsync is designed to be easy to use, the operations needed to synchronize the portage tree would be very simple.

The method I present here would involve these operations on the server-side:

  1. Create a compressed (gzipped) iso file of the portage tree on the server-side.
    This file would weigh in at about 27 MB nowadays.
  2. Create the .zsync file (very fast, and about 700 KB with a 4K block size).
  3. Move both files to the public HTTP directory.

On the client side, only this would be necessary:

  1. Use zsync to synchronize the portage ISO (this file would be about 260 MB).
  2. Unmount /usr/portage (old ISO file is still unaltered on disk).
  3. Mount (-o ro,loop) /usr/portage (with the new ISO file).
  4. Remove old ISO file.

As you can see, the ISO file is mounted in /usr/portage read-only, which I believe is not a problem if the user changes DISTDIR and PKGDIR in /etc/make.conf.
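For reference, that change could be just two lines in /etc/make.conf (the paths below are only examples; pick any writable location):

```shell
# /etc/make.conf -- example only: point DISTDIR/PKGDIR outside the
# read-only ISO mount so downloads and binary packages stay writable
DISTDIR="/var/portage/distfiles"
PKGDIR="/var/portage/packages"
```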

This method currently also has its share of problems:

  1. zsync is, according to the author, in alpha stage (although it works great already!).
    I believe gentoo users would be an excellent testing crowd :wink:
  2. For low-disk-space servers/routers it may not be possible to store a temporary 260 MB ISO.
    This can be circumvented because zsync can read the input file from a pipe.
    So we could umount /usr/portage, compress the ISO file, pipe it to zsync (with zcat), remove the old compressed ISO file and mount /usr/portage again.
    This, of course, would make the portage tree unavailable during synchronization.

This method, however, has the following advantages:

  1. It uses the network-efficient rsync algorithm.
    The current tests performed by the author suggest its bandwidth usage is more or less comparable to rsync.
  2. Works behind restrictive firewalls, as it only uses HTTP.
  3. It is much lighter on the servers.
  4. The disk usage pattern is very efficient, because it works sequentially on only 2 files.
  5. zsync is written in plain C, which is fast and easily portable to all architectures (it already works on AMD64, at the very least).
    And it's being actively developed.
  6. It's simple!

--------------------------------------------------------------------------------------------------------

So if you think this is a worthy idea, how about setting up an experimental zsync portage mirror?
I would do this myself, but my upstream bandwidth is *very* limited :(

If you are willing to give it a try, here's a preliminary HOWTO:

Necessary steps to make the portage tree available on the mirror:

  1. emerge zsync.
  2. mkdir /var/www/localhost/htdocs/portsync (modify as necessary).

Then periodically:
Code:
emerge sync

cd /var/www/localhost/htdocs/portsync

mkisofs -r -x distfiles -x packages /usr/portage | gzip --best > portage.iso.tmp.gz
zsyncmake -C -b 4096 -u http://www.mirror-website.xyz/portsync/portage.iso.gz portage.iso.tmp.gz

mv portage.iso.tmp.zsync portage.iso.zsync
mv portage.iso.tmp.gz portage.iso.gz

Modify as necessary, specifically the /var/www/... directory and the http:// website URL.

Just make sure you do that at least once per day, and then.. post the .zsync URL here!
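One way to guarantee the once-a-day update would be a cron entry; /usr/local/bin/portsync-update.sh is a hypothetical name for a script containing the commands above:

```
# /etc/cron.d/portsync -- hypothetical example
15 3 * * * root /usr/local/bin/portsync-update.sh >> /var/log/portsync.log 2>&1
```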

Gentoo users would only have to do the following to use this mirror:

  1. emerge zsync.
  2. Change DISTDIR and PKGDIR in /etc/make.conf
  3. mkdir /var/lib/portsync

Then periodically:
Code:
cd /var/lib/portsync

zsync -o portage.iso -k portage.iso.zsync http://www.mirror-website.xyz/portsync/portage.iso.zsync

umount /usr/portage
rm portage.iso.zs-old
mount -o ro,loop /var/lib/portsync/portage.iso /usr/portage
emerge metadata


Of course, this could use error checking (like /usr/portage being in use) and automation (bash scripts anyone?).
And emerge metadata would only be necessary if zsync actually retrieved a newer file.
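On the "bash scripts anyone?" note, here is a rough sketch of what such a client script might look like. It is untested against a real mirror: the URL is the placeholder from above, and both the fuser check and the assumption that zsync leaves the old image behind as portage.iso.zs-old should be verified before use.

```shell
#!/bin/bash
# Sketch only -- defines functions, runs nothing by itself.
MIRROR="http://www.mirror-website.xyz/portsync"   # placeholder mirror URL
WORKDIR="/var/lib/portsync"

# Pure helper: succeeds if the two files differ (i.e. the tree changed).
tree_updated() {
    ! cmp -s "$1" "$2"
}

sync_portage() {
    cd "$WORKDIR" || return 1

    zsync -o portage.iso -k portage.iso.zsync \
        "$MIRROR/portage.iso.zsync" || return 1

    # If zsync kept the old image as portage.iso.zs-old (assumption!),
    # skip the remount + emerge metadata when nothing actually changed.
    if [ -f portage.iso.zs-old ] && ! tree_updated portage.iso portage.iso.zs-old; then
        echo "Portage tree already up to date."
        rm -f portage.iso.zs-old
        return 0
    fi

    # Refuse to remount while something is still using /usr/portage.
    if fuser -m /usr/portage >/dev/null 2>&1; then
        echo "/usr/portage is busy, try again later." >&2
        return 1
    fi

    umount /usr/portage || return 1
    rm -f portage.iso.zs-old
    mount -o ro,loop "$WORKDIR/portage.iso" /usr/portage || return 1
    emerge metadata
}
```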

Also, let me be clear that I'm not saying we should replace the rsync mirrors (yet) :P
But some experimental zsync mirrors would be great :D

So... what do you think? Comments and suggestions please 8)

Update: Some cosmetic changes :)
Update: zsync 0.3.2 released
Update: Client commands updated: If the client retrieves the .zsync files with wget -N, it only downloads anything from the server if there's actually a newer file! Reduced bandwidth and reduced time :)
Update: zsync 0.3.3 released, wget no longer necessary: builtin -k option available; :wink: added -C to zsyncmake


Last edited by wizeman on Wed Mar 30, 2005 10:43 pm; edited 4 times in total
garfield
n00b


Joined: 03 Oct 2003
Posts: 14
Location: Aalborg, Denmark

PostPosted: Tue Mar 22, 2005 4:20 pm    Post subject: Reply with quote

This is interesting, but I'm sure it will help your cause if you provide data about
- time for sync of entire tree
- bandwidth for sync of entire tree (use jnettop or the like to measure this)
- load during sync on (most importantly) server and (less importantly) client
for both the rsync and zsync solution. Numbers for websync could also be included.
You link to some tests, but I didn't see tests that were comparable to a portage sync (100k+ small files, and ½-2k changing per day). Also, the rsync was run through a tunnel, I believe. And he talks about compression, but it wasn't very clear to me what it means in the end (read: I just browsed his page quickly). Finally, it would be nice to know more about how well the server could maintain an updated database.
wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Tue Mar 22, 2005 5:11 pm    Post subject: Reply with quote

Well, I would provide the data but.. I would need someone to setup a zsync mirror (with good bandwidth) for testing :P

The zsync author even offered to setup a .zsync of the portage tarball on his server, but that wouldn't work as well as this, because users would have to remove the entire portage tree and extract the new tarball.

The ISO way would be much better, as it can be mounted directly (no extraction/disk thrashing needed).

However, the author is more of a debian/freebsd user himself, so he can't sync the portage tree at least daily.. :twisted:

Also some clarifications to the original post may be necessary:

Users would only have to download the changed portions of the ISO, compressed. Some people didn't quite understand this.
Worst case scenario (user doesn't have the portage tree) would only be a 27 MB download. Of course, if only 800 KB of the uncompressed ISO was changed, then only the compressed portions that make up those 800 KB would need to be downloaded. This is because the .zsync file has an internal mapping of the uncompressed to the compressed portions.
Someone mentioned the ordering of the ISO file creation.. that might make users download a bit more, but I think it's possible to sort the files during ISO creation, if I'm not mistaken.
Anyway, don't forget the rsync algorithm works on overlapping blocks, so as long as the content is there, it should be possible to recreate it in another place without downloading.

Also another advantage would be speed. Think how fast disks work sequentially as opposed to having to work on 110,000 scattered files!

But of course, only if someone sets this up will we be able to see how well it works.. *hint* *hint* ;)

It has the potential to be much better than *at least* emerge-webrsync.. so I don't see the problem :P
Betelgeuse
Developer


Joined: 10 Aug 2004
Posts: 12
Location: Finland

PostPosted: Tue Mar 22, 2005 6:23 pm    Post subject: Reply with quote

Is 1 mbit enough? I can set this up for a few users. You can find me as Betelgeuse@freenode.
Betelgeuse
Developer


Joined: 10 Aug 2004
Posts: 12
Location: Finland

PostPosted: Tue Mar 22, 2005 7:21 pm    Post subject: Reply with quote

Ok. I now have a working server. I made a cron job for updating the server.
http://a.bo.cx/gentoo/sync.sh
It requires zsync and cdrtools. I use esync from esearch to update the tree but one could also use emerge sync.
wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Tue Mar 22, 2005 7:32 pm    Post subject: Reply with quote

Ok, thanks to Betelgeuse, I have some field results: :D

I created an ISO of my portage tree which is updated as of 2005/03/20 01:36.
Betelgeuse 'emerge sync'ed his tree as of now - 2005/03/22 19:00 (here) and then created a new ISO and a new .zsync file.

This is the result:
Code:

wizy portsync # /usr/bin/time -v zsync -o portage.iso http://.../portsync/portage.iso.zsync
reading seed file portage.iso: *******************************************************************
****************************************************************************************************
*************************************************************************************************************
downloading from http://.../portsync/portage.iso.gz:.....
hashhit 8718704, weakhit 54444, checksummed 55352, stronghit 54801
verifying download...checksum matches OK
used 223760384 local, fetched 5252772
        Command being timed: "zsync -o portage.iso http://.../portsync/portage.iso.zsync"
        User time (seconds): 10.35
        System time (seconds): 1.60
        Percent of CPU this job got: 5%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 3:51.16
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 2
        Minor (reclaiming a frame) page faults: 1813
        Voluntary context switches: 6345
        Involuntary context switches: 7555
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

So as you can see, it took less than 4 minutes of real time on my 512Kb ADSL connection.
Let me just say that this was *a lot* faster than an rsync, never mind an emerge-webrsync.. (which also needs a rsync).

Notice that the total download was about 5.7 MB (including the .zsync file).
This is more than an rsync, but it's much less than an emerge-webrsync.

This may also be because my ISO file wasn't ordered the same way as Betelgeuse's, since mkisofs probably orders the files as they are on disk.
So the download size might be bigger than necessary, because it must download the directory entries of the ISO file.
Even the way it is now, I think it's good enough.

Skimming through mkisofs's manpage, I can see there's an option to sort file *data* based on the file name/path (we could sort it alphabetically).
This might provide an improvement on the download size. But anyway it's already better than emerge-webrsync, and just as easy..

Also notice that zsync is still in the early stages of development, there's still room for improvement and possible optimizations!

We'll see how it goes the next few days :)


Last edited by wizeman on Tue Mar 22, 2005 8:06 pm; edited 1 time in total
Betelgeuse
Developer


Joined: 10 Aug 2004
Posts: 12
Location: Finland

PostPosted: Tue Mar 22, 2005 7:33 pm    Post subject: Reply with quote

I only have 1 mbit upstream so I don't want tons of users. I can provide the url to users on request.
wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Tue Mar 22, 2005 8:05 pm    Post subject: Reply with quote

As Betelgeuse suggested, I'm going to provide more comparisons between the different sync methods, including the time it takes to 'emerge metadata' where necessary (I was forgetting about this part :roll: ).

Between the tests I will reset /usr/portage to a snapshot of 20050319 (and emerge metadata).

Of course these results are only empirical, and they can vary a lot between different systems; however, they can still be a good comparison.

Let's see how it goes 8)
jd5419
Tux's lil' helper


Joined: 26 Apr 2004
Posts: 110
Location: RI, USA

PostPosted: Tue Mar 22, 2005 9:20 pm    Post subject: Reply with quote

I have a server in California with a pretty fast upstream. If you want, I can set up some zsync stuff - just tell me how, and/or I'll give you a shell to set up whatever is necessary... rsync is kind of a pain here too.
Betelgeuse
Developer


Joined: 10 Aug 2004
Posts: 12
Location: Finland

PostPosted: Tue Mar 22, 2005 9:53 pm    Post subject: Reply with quote

jd5419 wrote:
I have a server in California with a pretty fast upstream. If you want, I can set up some zsync stuff - just tell me how, and/or I'll give you a shell to set up whatever is necessary... rsync is kind of a pain here too.


Just use the script I provided. It can be run manually or via cron for daily runs. Inside the script you can specify where the files go. Default is /var/www/localhost/htdocs/portsync
dufnutz
Apprentice


Joined: 01 May 2002
Posts: 209

PostPosted: Tue Mar 22, 2005 11:46 pm    Post subject: Reply with quote

Have you submitted this idea to bugzilla with a link to this forum post? If you have, I would like to read the bug.
wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Tue Mar 22, 2005 11:57 pm    Post subject: Reply with quote

I didn't submit a bug because I thought it would be more appropriate to discuss here first.

But I submitted an email to the gentoo-dev mailing list, and I received some answers.
You may read the archive if you're interested :)

However, I'm having trouble responding to the mailing list, it seems my mails aren't getting through..
wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Wed Mar 23, 2005 3:45 am    Post subject: Reply with quote

Ok, here are some figures.
Remember, this is only to have an idea of the time it takes on my PC, it can vary a lot among systems!

My machine is an AMD Athlon 64, with 1 GB RAM, with a 120 GB SATA disk with the XFS filesystem.
The filesystem of /usr/portage, however, is Reiserfs.
Bandwidth usage was measured with jnettop.

Before each one of these tests, I did:
Code:

rm -rf /usr/portage/*
cd /usr && tar -jxf /tmp/portage-20050319.tar.bz2
emerge metadata
sync


Here are some numbers for you:

emerge-webrsync:
Code:

00:47:45 - Started
01:05:36 - Download of the tarball finished [Download: 19.2M, Upload: 615K]
           Extraction of the tarball started
01:09:58 - Extraction of the tarball finished, local rsync started
01:10:45 - Local rsync finished, cleaning (rm -rf /var/tmp/emerge-webrsync/portage) started
01:12:18 - Cleaning finished, emerge metadata started
01:14:19 - Finished


Intermediate times were approximately measured (meaning, I just looked at the clock).
So, basically it took me 18 minutes to download the tarball and a little less than 9 minutes to synchronize the tree.

For zsync, first I created the ISO of the portage tree snapshot of 20050319 and put it in /var/lib/portsync.
Then I used this script:
Code:

date
/usr/bin/time -v zsync -o portage.iso http://.../portsync/portage.iso.zsync
mount -o ro,loop /var/lib/portsync/portage.iso /usr/portage
date
/usr/bin/time -v emerge metadata
date


zsync:
Code:

01:41:09 - Started
01:44:14 - zsync finished, [Download: 6.0M, Upload: 238K]
           emerge metadata started
01:46:40 - Finished


So, as you can see, it took me 3 minutes to synchronize the ISO file and 2.5 minutes to emerge metadata.

Please be aware that my download speed during the zsync test was roughly twice my download speed during the emerge-webrsync test!
So you can't compare the figures directly :)

emerge sync:
Code:

02:03:12 - Started
03:39:00 - Still running... lol


Actually, I'm having problems with rsync, it seems my ISP doesn't like it so much this week..
So if anyone wants to give some numbers, go right ahead.. :D

A few more things:

For those interested, apparently the ISO files must have the directories sorted alphabetically (the standard says so, I believe).
Only file data is unsorted (but it can be sorted through mkisofs --sort option).
wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Wed Mar 23, 2005 6:06 am    Post subject: Reply with quote

If all goes well, we should have a working zsync portage mirror tomorrow, available for all gentoo users. 8)

After that, I'll submit the zsync ebuilds to bugzilla and then I'll start to work on an easy-to-use emerge-zsync script for anyone to use :)

I'll keep you posted.
dufnutz
Apprentice


Joined: 01 May 2002
Posts: 209

PostPosted: Wed Mar 23, 2005 4:56 pm    Post subject: Reply with quote

perhaps you should edit your first post to reflect the --sort option to be passed to mkisofs
wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Wed Mar 23, 2005 6:22 pm    Post subject: Reply with quote

The mkisofs --sort option requires a list of the files and their weights (it has to be precomputed). :?
I'm not sure if it's worth the trouble, especially if one always uses the same mirror to sync.
I think that may require more analysis. ;)
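For what it's worth, precomputing such a weight list doesn't look too bad. A sketch (my assumption is that the sort file takes one "path weight" pair per line and that higher weights are stored first; check the mkisofs manpage before relying on this):

```shell
# Build a mkisofs sort-weight list that stores file data alphabetically:
# the first file alphabetically gets the highest weight.  Demo on a toy tree.
rm -rf /tmp/sortdemo && mkdir -p /tmp/sortdemo/tree
touch /tmp/sortdemo/tree/b.ebuild /tmp/sortdemo/tree/a.ebuild /tmp/sortdemo/tree/c.ebuild

find /tmp/sortdemo/tree -type f | sort | \
    awk '{ print $0, 1000000 - NR }' > /tmp/sortdemo/weights

cat /tmp/sortdemo/weights
# then: mkisofs -r -sort /tmp/sortdemo/weights ... (not run here)
```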

Anyway, the zsync-0.3.2 ebuild (bug #86406) is submitted.
cphipps
n00b


Joined: 23 Mar 2005
Posts: 1

PostPosted: Wed Mar 23, 2005 8:33 pm    Post subject: comment from the author Reply with quote

garfield wrote:
This is interesting, but I'm sure it will help your course if you provide data about
- time for sync of entire tree
- bandwidth for sync of entire tree (use jnettop or the like to measure this)
- load during sync on (most importantly) server and (less importantly) client



Load on the server is minimal - certainly far less than rsync (probably hardly more load than a normal HTTP download, but I don't have solid figures). Load on the client is higher; for a 200meg ISO, perhaps 1-2 minutes of CPU time while it runs the rsync algorithm. Bandwidth should be whatever the best HTTP transfer speed from server to client is.

garfield wrote:

You link to some tests, but I didn't see tests that was comparable to a portage sync (+100k small files, and ½-2k changing per day). Also the rsync was run through a tunnel I believe. And he talks about compression, but it wasn't very clear to me what it means in the end (read: I just browsed his page quickly).


I'm the author, so I can say the paper is unclear because the results are not yet clear - compression is a big saving but I'm not sure I've squeezed optimal performance out of either gzip --rsyncable or zsync's method yet. Roughly speaking, it seems that zsync with compression is about as good as zsync without compression when there are very few changes, and is much more efficient when there are large changes. For small changes, rsync with compression still beats zsync by a good margin. zsync is meant to extend the rsync idea to HTTP downloads and areas where low server load is critical: it definitely does not challenge rsync in applications where a dedicated server is available.

I myself don't have figures for transferring an ISO populated with lots of small files, but it looks like they are already trying it out with some success - I'm awaiting more figures with interest :-)

zsync is at an alpha stage, but is starting to stabilise; and this at least has the advantage that it's at an early enough stage that you can ask for features and shape its development.
tuxp3
n00b


Joined: 28 May 2004
Posts: 61

PostPosted: Wed Mar 23, 2005 8:51 pm    Post subject: Reply with quote

I like the idea - but I'm not too familiar with how ISOs work (like the data layout)

can an ISO be mounted rw? or would it always be ro? just wondering

also, how does zsync handle the ISO? I mean, how much "extra" information would be downloaded versus how much information would actually need to be downloaded, aka size of downloaded ebuilds vs. ISO differences - also, would it be possible to compress all this data on the fly?

just some thoughts after reading this thread.

i really like the idea tho - right now i have a separate partition for portage just because of the sheer number of files

Thanks, Tux
wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Wed Mar 23, 2005 10:00 pm    Post subject: Reply with quote

Well, ISOs must be mounted ro.

About how much more information it transfers (being an ISO), I don't really know. :roll:

But about compressed transfers.. it could be possible to store the ISO file uncompressed on the server, and then compress the transfers with Apache's mod_gzip - but perhaps only if mod_gzip compresses multi-part, pipelined range requests.

That's another thing we could try, if/when zsync supports it :)
tuxp3
n00b


Joined: 28 May 2004
Posts: 61

PostPosted: Thu Mar 24, 2005 4:18 am    Post subject: Reply with quote

see, as much as i think your idea rocks - i just see a few too many drawbacks in the ISOs only being mountable ro - this is in a way "inefficient". if we had another "loop" FS that was as light as ISO (very similar) but r/w (maybe UDF or something), then i could see some really big advantages.
for example - the server would just append the new ebuild data to the end - and this would make the sync even more efficient - it just scans and then 'oh look, more data than what i already have' and DLs the data - not as much searching around to figure out changes...

but then again i'm sure that has drawbacks as well /me wishes RAM was cheaper and portage was smaller :-P (RAM DISK lol)

but if you could get compression to work with zsync it would be freakin' awesome - can ISOs be set to compress the data as it's added to them upon creation? maybe that could be a workaround.. but indeed it would hurt performance for clients when querying the DB

Thanks, Tux
kimchi_sg
Advocate


Joined: 26 Nov 2004
Posts: 2915
Location: Singapore

PostPosted: Thu Mar 24, 2005 4:38 am    Post subject: Reply with quote

Someone make genone read this thread please... This sounds great, but if it is ever to make its way into portage, a portage dev must see this. And genone's the one portage dev who goes around the forums most often. ;)

@wizeman: File a bug on http://bugs.gentoo.org too. Then they will have to give this a look.
_________________
Murphy's Law of Gentoo installation: If a compile can fail, it will.

MacGillicuddy's Corollary: At the most inopportune time.

Please search and read the FAQs before posting.
wizeman
n00b


Joined: 11 May 2004
Posts: 12
Location: Portugal

PostPosted: Thu Mar 24, 2005 9:58 pm    Post subject: Reply with quote

Just to let you know, today we've been making some tests with different storage formats and compression.

I used 2 portage trees with about a 24-hour difference to make some comparisons, trying mainly to reduce the necessary bandwidth to synchronize.

A compressed ISO file would take about 4.5M + 0.7M (for the .zsync) of bandwidth.

The best we've come so far, is to use an uncompressed squashfs (88M).
If we use gzip to compress the uncompressed squashfs on the server, it becomes 21M in size (with a 0.5M .zsync file, if we use 2048-byte blocks).

Then, the client synchronizes with zsync, using only (in this test) 2.5M + 0.5M of bandwidth.
So that's already a nice improvement - about 3 MB of bandwidth versus the 5.2 MB necessary for the ISO, and a much smaller file in the client :)
Of course if you consider the about 18-19 MB of bandwidth or so for emerge-webrsync, it looks even better :lol:

The problem is that not everyone has squashfs compiled into the kernel. However, karltk told me the latest gentoo kernel sources have it/will have it integrated.
But we don't want to force anyone to use a non-standard filesystem, so even if we use squashfs it will be optional, of course. :)
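For anyone who wants to reproduce the squashfs experiment, the server-side step might look roughly like this. A sketch only: the -noI/-noD/-noF flags (to disable compression so zsync's gzip mapping stays effective) and the -e excludes should be checked against your squashfs-tools version, and the zsyncmake output filename follows the convention from the ISO script in the first post.

```shell
# Sketch of the server-side squashfs variant (nothing runs until invoked).
make_squashfs_image() {
    # uncompressed squashfs of the tree, skipping distfiles/packages
    mksquashfs /usr/portage portage.sqfs.tmp \
        -noI -noD -noF -e distfiles -e packages || return 1

    gzip --best < portage.sqfs.tmp > portage.sqfs.tmp.gz

    # 2048-byte blocks, as in the experiment above
    zsyncmake -C -b 2048 \
        -u http://www.mirror-website.xyz/portsync/portage.sqfs.gz \
        portage.sqfs.tmp.gz || return 1

    mv portage.sqfs.tmp.zsync portage.sqfs.zsync
    mv portage.sqfs.tmp.gz portage.sqfs.gz
    rm -f portage.sqfs.tmp
}
```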

Meanwhile, karltk is setting up the server so you'll have to wait until he can be sure the scripts are working fine ;)

BTW, I don't want to hassle the portage devs right now more than is necessary. I've already brought this subject up on the IRC channels, and I've sent a mail to the gentoo-dev mailing list, that should be enough for now :P

When people start using it more and we can be sure everything is working right, then we should discuss about including it in portage :)
tuxp3
n00b


Joined: 28 May 2004
Posts: 61

PostPosted: Sat Mar 26, 2005 11:27 pm    Post subject: Reply with quote

i just thought of something extremely versatile: torrents. Torrents are made to "fill in the missing data" and such, and it's all hash checks, so you can be assured that your ISO matches what everyone else has. Maybe a branch of existing torrent code and a tracker would make this a viable solution. The only thing is: will torrents be able to "make room" for new data without trashing too much data that's going to end up being redownloaded? I would think this is possible if we use very small piece sizes... but maybe this zsync is a better solution. But either way, this could really help reduce the load on the rsync servers - because we could make it default (changeable of course) to allocate some of the user's bandwidth anytime emerge is running, and since most users do emerge -avuD right after emerge sync, this could make it very easy to 'repay' the leecher pool.. but this could go all wrong if no one shares.. but either way it would centralize the main portage DB a little bit more... the torrents themselves would have to be automagically updated when the user starts the sync - but besides that, the rsync servers could be turned into seeders.. cron them to fetch a new torrent every half hour..

just some thoughts as i make some more free space as i open up some torrents :)

Tuxp3
ssvb
Tux's lil' helper


Joined: 06 Nov 2003
Posts: 96

PostPosted: Sun Mar 27, 2005 11:02 am    Post subject: Reply with quote

One more way to reduce bandwidth is to use delta patches instead of the rsync algorithm. The server can create iso snapshots every day and create delta patches that convert one snapshot into another. The user will only have to download the appropriate patch and reconstruct the most up-to-date portage image using just this patch, without downloading anything else. One drawback here is that if the user does not sync for a long time, he may not find a proper patch for his outdated portage snapshot. Anyway, please try a gzip-compressed bdelta patch in your experiments, just for comparison, to see if this could theoretically save bandwidth and time compared with rsync and zsync.
danhan
n00b


Joined: 03 Nov 2003
Posts: 2
Location: Frankfurt - Germany

PostPosted: Mon Mar 28, 2005 8:46 am    Post subject: rsync server on tcp/443 for better ssl proxy compatibility Reply with quote

Hi Community,

I've posted another idea for the "rsync problems" some threads ago.

>What's against running some of the rsync mirror servers on tcp/443 instead of rsync's default port?
>TCP/443 is allowed in most environments for "METHOD CONNECT" on the proxies.

http://forums.gentoo.org/viewtopic-t-102461-highlight-rsync+tcp+443.html

--daniel