Gentoo Forums
multi threaded emerge- downloading while compiling

popel
n00b


Joined: 29 May 2004
Posts: 3

Posted: Sat May 29, 2004 6:26 pm    Post subject: multi threaded emerge- downloading while compiling

Is there any way to download and compile at the same time?

Background:
I have a "small" internet connection. Downloading bigger packages often takes more than 5 minutes. By compiling while downloading, I think I could cut the build time for my system in half.

fifo
Guru


Joined: 14 Jan 2003
Posts: 437

Posted: Sat May 29, 2004 6:31 pm

Yes, use emerge with the "-f" switch to do the downloading, then once it has downloaded the first package or two, start the actual emerge command alongside it to do the compiling.
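A rough sketch of that two-process approach, written in Python purely for illustration. It assumes `emerge` is on the PATH and uses `--fetchonly`, the long form of `-f`. The helper names, the `runner` parameter and the five-minute head start are made up for this example, and nothing here stops the build from catching up with the download.

```python
import subprocess
import time


def fetch_command(targets):
    """argv for a download-only emerge run (-f is short for --fetchonly)."""
    return ["emerge", "--fetchonly", *targets]


def build_command(targets):
    """argv for the normal emerge run that does the compiling."""
    return ["emerge", *targets]


def fetch_then_build(targets, head_start_seconds=300, runner=subprocess):
    """Start the fetch in the background, wait a while, then build.

    head_start_seconds is only a guess at how long the downloader needs
    to get a package or two ahead of the compiler.
    """
    fetcher = runner.Popen(fetch_command(targets))
    time.sleep(head_start_seconds)
    build_status = runner.call(build_command(targets))
    fetcher.wait()
    return build_status
```

Calling `fetch_then_build(["world"])` as root would run `emerge --fetchonly world` in the background and `emerge world` in the foreground.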

popel
n00b


Joined: 29 May 2004
Posts: 3

Posted: Sat May 29, 2004 6:38 pm

Yes, I tried this.

But sometimes my CPU is just faster than the download, and then something ugly happens to the partly downloaded file: it gets corrupted.

Something needs to be built with monitoring, which checks whether the file is complete and otherwise waits for it to complete.
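One way such a monitor could look, as a minimal Python sketch. It assumes the expected size and SHA-256 checksum of the tarball are known ahead of time; how they are obtained is left out, and all function names and the polling defaults are invented for this example.

```python
import hashlib
import os
import time


def sha256_of(path):
    """Hex SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_complete(path, expected_size, expected_sha256):
    """True only if the file exists with the expected size and checksum."""
    try:
        if os.path.getsize(path) != expected_size:
            return False
        return sha256_of(path) == expected_sha256
    except OSError:
        return False  # missing, or removed while we were looking


def wait_until_complete(path, expected_size, expected_sha256,
                        poll_seconds=5.0, timeout_seconds=3600.0):
    """Poll until the file is complete; return False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if is_complete(path, expected_size, expected_sha256):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

The size check is cheap, so the checksum is only computed once the file has reached its full length. A file that arrives corrupted never passes and simply runs into the timeout.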

Nate_S
Guru


Joined: 18 Mar 2004
Posts: 414

Posted: Sat May 29, 2004 7:31 pm

This topic has been thoroughly discussed. I believe the problem is that Portage has trouble when more than one instance of it is running at a time. Several workarounds have been proposed; take a look at the thread about this in Bugzilla.

robmoss
Retired Dev


Joined: 27 May 2003
Posts: 2634
Location: Jesus College, Oxford

Posted: Sat May 29, 2004 8:36 pm

Portage can have multiple instances running with no problem now. But this is a different kettle of fish; it's actually pretty tricky.
_________________
Reality is for those who can't face Science Fiction.

emerge -U will kill your Gentoo
ecatmur, Lord of Portage Bash Scripts

meowsqueak
Veteran


Joined: 26 Aug 2003
Posts: 1549
Location: New Zealand

Posted: Sat Jul 03, 2004 2:32 am

What if emerge created a lock indicating 'file is being downloaded', and the other instance of emerge saw that lock and either built other already-downloaded packages (as long as dependencies are met) or else blocked on the lock? You just need a flag for emerge that tells it not to download anything and to wait on locks.

Then you can have one emerge running -f downloading the files in a suitable order (by dependency as it does it now would be fine) and another emerge running -?? that tells it to do as I describe above.
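The locking idea could be sketched roughly like this in Python, using advisory locks from `fcntl.flock` (Unix only). This is not how emerge works; the lock file naming and every function name here are hypothetical. The fetcher would hold the lock while downloading, and the builder would either block on it or probe it and move on to another package.

```python
import fcntl
import os
from contextlib import contextmanager


def lock_path_for(distfile):
    """Hypothetical naming scheme: one lock file next to each tarball."""
    return distfile + ".download-lock"


@contextmanager
def download_lock(distfile, blocking=True):
    """Hold an exclusive advisory lock for the given tarball.

    With blocking=True the caller waits until the lock is free. With
    blocking=False a busy lock raises BlockingIOError instead, so the
    builder can go and build something else.
    """
    fd = os.open(lock_path_for(distfile), os.O_CREAT | os.O_RDWR, 0o644)
    try:
        flags = fcntl.LOCK_EX | (0 if blocking else fcntl.LOCK_NB)
        fcntl.flock(fd, flags)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def is_being_downloaded(distfile):
    """Non-blocking probe: True if someone else holds the lock."""
    try:
        with download_lock(distfile, blocking=False):
            return False
    except BlockingIOError:
        return True
```

A free lock only says that nobody is downloading right now. The builder would still have to check that the tarball exists and is complete, since the fetcher may not have reached it yet.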

This wouldn't be perfect, since the optimal build time depends on optimising the availability of source packages to the building 'thread', and this would be tricky to estimate (use source tarball size maybe, if we could assume build time and certainly download time are proportional to file size?), but at least the method above would work quite well generally.

robmoss
Retired Dev


Joined: 27 May 2003
Posts: 2634
Location: Jesus College, Oxford

Posted: Sat Jul 03, 2004 6:25 am

Well, you probably want to download things in the same order you're going to emerge them. But your idea is nice, if a little tricky to implement. Maybe one to have a go at once the Portage API shows up?

meowsqueak
Veteran


Joined: 26 Aug 2003
Posts: 1549
Location: New Zealand

Posted: Sat Jul 03, 2004 7:13 am

Well, I actually think it would be fairly simple to implement to start with - "oops, this package is 'download-locked' - we'll sit here and wait for it to finish". This will result in stop-start behaviour but it will still produce good results if you have a fast downlink. And it will prevent the problem of the second emerge catching up with the download. Since this already seems like a popular solution (two emerges, one downloading, one building started some time later) then waiting on 'download locks' will prevent the rather nasty effect of trying to download the same thing twice at the same time. People already do this (and in a way it's risky) so this could make it perfectly safe. After that, yes, it starts to get a little more complicated. We'd need a good heuristic to decide which package to download next for optimal performance (based on download speed, compilation speed, package size, maybe something that learns over time?).

Typing as I think: consider a package with size N bytes that takes D seconds to download and C seconds to compile (remember we often don't know C and D just yet - we'd have to make a good guess most of the time based on past history and N). If D << C then download this package early, since we'll have lots of time while it's building to download other packages. If D >> C then we won't get much benefit building this, so look for better cases first.

For the collection of packages, estimate C and D for each, and pick the one with the largest difference between C and D where D is less than C. Download this first.
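As a minimal Python sketch of that selection rule, assuming the estimates for D and C already exist (the hard part, which is not attempted here). The `Estimate` type and the function name are made up for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Estimate(NamedTuple):
    name: str
    download_seconds: float  # D, estimated
    compile_seconds: float   # C, estimated


def pick_first_download(estimates: Sequence[Estimate]) -> Optional[Estimate]:
    """Return the package with the largest C - D among those with D < C.

    Returns None when no package compiles for longer than it downloads.
    """
    candidates = [e for e in estimates
                  if e.download_seconds < e.compile_seconds]
    if not candidates:
        return None
    return max(candidates,
               key=lambda e: e.compile_seconds - e.download_seconds)
```

With invented numbers such as `Estimate("a", 60, 120)`, `Estimate("b", 600, 900)` and `Estimate("c", 300, 10)`, the rule picks `b` (C - D = 300) and never considers `c`, because its download takes longer than its build.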

This algorithm is 'greedy' and may not always work very well, since (a) the heuristic estimation may fail horrendously, or (b) the first package you download may take so darn long to download that you would have been better off downloading everything else first, starting to compile those, and downloading the big package last.
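Caveat (b) can be checked with a toy model in Python. The model is an assumption made for this example: one downloader and one builder, both working through the packages in the same order, with exact (not estimated) times. The numbers below are invented.

```python
def total_time(order):
    """Finish time for one downloader feeding one builder.

    order is a list of (download_seconds, compile_seconds) pairs, taken
    in that order by both the downloader and the builder.
    """
    download_done = 0.0
    build_done = 0.0
    for download_seconds, compile_seconds in order:
        download_done += download_seconds
        # A build starts once its tarball is here and the builder is free.
        build_done = max(build_done, download_done) + compile_seconds
    return build_done


big = (600, 900)    # C - D = 300, so the greedy rule would fetch it first
small = (60, 120)   # C - D = 60

greedy_first = total_time([big, small, small, small])
big_last = total_time([small, small, small, big])
```

In this model `greedy_first` comes to 1860 seconds and `big_last` to 1680 seconds. The builder sits idle for the whole ten-minute download of the big package when it goes first, but only for part of it when the small packages have already given it something to do.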

Some packages contain a lot of non-compiled information (e.g. nvidia binary drivers and big tcl/tk or perl apps) so they may take a long time to download but are built in seconds. In this case using N as a metric isn't going to work too well. Might need some way of feeding the success of the estimates back into the system so that the ebuilds contain this and the entire system learns... or something...

We could call it SkyNET... 8O

robmoss
Retired Dev


Joined: 27 May 2003
Posts: 2634
Location: Jesus College, Oxford

Posted: Sat Jul 03, 2004 7:40 am

It's very tricky, yes! But I think that a naive implementation would certainly be better than no implementation at all. I think it's probably more sensible to stick a flag in the ebuild which denotes whether or not a package is "pure source" - so gcc is, openoffice is, but openoffice-bin isn't, and neither is the nvidia stuff. You can stick those last for compiling. The others - I suspect that it would make more sense to order things in such a way that smallest things go first, biggest things go last, whilst not breaking the depgraph. Of course, we can't do this until we get a proper depgraph.
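A naive version of that ordering could look like the following Python sketch. It assumes a dependency graph is already available as a plain dict, which is exactly the part that was missing at the time, and that each package carries a "pure source" flag. The function name and data layout are invented for this example.

```python
import heapq


def download_order(packages, depends_on):
    """Order packages smallest-first without breaking dependencies.

    packages: name -> (size_bytes, pure_source)
    depends_on: name -> set of names that must come earlier
    Among packages whose dependencies are already placed, pure-source
    ones go before the rest, and smaller ones before bigger ones.
    """
    remaining = {name: set(depends_on.get(name, ())) & set(packages)
                 for name in packages}
    dependents = {name: [] for name in packages}
    for name, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(name)

    def key(name):
        size_bytes, pure_source = packages[name]
        return (not pure_source, size_bytes, name)

    ready = [key(name) for name, deps in remaining.items() if not deps]
    heapq.heapify(ready)
    order = []
    while ready:
        _, _, name = heapq.heappop(ready)
        order.append(name)
        for dependent in dependents[name]:
            remaining[dependent].discard(name)
            if not remaining[dependent]:
                heapq.heappush(ready, key(dependent))
    if len(order) != len(packages):
        raise ValueError("dependency cycle")
    return order
```

This is a topological sort that always takes the "cheapest" ready package next. A non-source package still goes early if something else depends on it, since the dependency graph takes priority over the size rule.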

tomk
Bodhisattva


Joined: 23 Sep 2003
Posts: 7221
Location: Sat in front of my computer

Posted: Sat Jul 03, 2004 10:12 am

Moved from Portage & Programming. Please search before posting; there is a FAQ about this:

https://forums.gentoo.org/viewtopic.php?t=30842
_________________
Search | Read | Answer | Report | Strip
All times are GMT