Gentoo Forums
Fun With Distcc
unheatedgarage
n00b

Joined: 19 Sep 2016
Posts: 34

Posted: Wed Jul 04, 2018 2:28 pm    Post subject: Fun With Distcc

Edit: Yeah, no, this was dumb. That's not how this works--THAT'S NOT HOW ANY OF THIS WORKS:

Conclusion: I've had best results with setting --randomize and doubling every host's job limit relative to its core count (over distcc's recommendations) in /etc/distcc/hosts. It makes a big difference to remember to double up the number of potential jobs [considering the number of each helper's cores] in your /etc/distcc/hosts file, as well as your make.conf. I'm also not afraid to do emerge -uDNavt @world at the same time between computers, as they're all doing their own things at their own times. It's rare that they clash, the one time being when one of them decided to build clang and rust at the same time. All it did was slow things down--in the end there were no errors.

Questions: Does it really matter how many jobs (-j) I set in MAKEOPTS so long as my load limit (-l) is sane? Couldn't I just throw in the maximum number of jobs and let the load setting manage it?

I also set EMERGE_DEFAULT_OPTS="--jobs 2 --load-average 0.9". So I'm thinking so long as the load average isn't over 0.9 that it won't try to start emerging another package, but if it does, will it send the same amount of jobs as set in MAKEOPTS for another package? In other words: would the computer then be sending out double the jobs it was sending before? On top of the first emerged/emerging package?

Sometimes I'm just fit-to-be-tied over how intelligent Portage is. I mean, really? Cool!

For as long as I've been reading (not that long) it's been dogma for us to set our -j* settings to double whatever cores we have running if we're using Distcc. Assuming we all have -zeroconf in our make.conf, here's how I go about setting up my build cluster for maximum CPU saturation:

Our own Neddy has stated (paraphrasing) that computers spend most of their time waiting for user input. So in the interest of economizing throughput, I place here my questions on why the man pages of distcc tout such great improvements in build time for our computers, yet my own results (out-of-the-box) have been so thoroughly mediocre.

Setting it up as per the instructions, everything hums along nicely, but I've wondered why, if my cranky old single core is sending jobs out to all the other computers, it's loaded up so much, yet my other honking fast (for me) quad core is getting barely a trickle of work to process.

Let's assume I have a teensy little single core sending out signals to a hulking quad core. Standard operating procedure would have you set your MAKEOPTS to "-j10" and then walk away happy, but the distcc man pages state that:

Quote:
/LIMIT
A decimal limit can be added to any host specification to restrict the number of jobs that this client will send to the machine. The limit defaults to four per host (two for localhost), but may be further restricted by the server. You should only need to increase this for servers with more than two processors.


So I set my /etc/distcc/hosts to look like this:

tiny-core quad-core --randomize

Hurray! We're talking! Lights are flashing on the router and things are diddling. But...snooze...things really aren't building faster, and my quad-core's CPUs are just playing with themselves--no real computation going on here, so let's try this:

quad-core

Nope, not really much difference, but the tiny-core is still loaded sending jobs.

Fine. The man pages say each helper only gets four jobs by default, so let's crank that up:

quad-core/8

NOW WE'RE TALKING. Full usage of all the cores, meanwhile tiny-core is able to happily hum along sending jobs.

In this case, cranking up the jobs sent to quad-core made a huuuuge difference. Removing tiny-core (localhost) also improved things--giving tiny-core the much-needed resources to run the distcc daemon and all its pre-op work.

Every case is different. If I were running the same job from my quad-core, I might exclude the tiny-core, but more than likely I would keep it included.
Or you could just run Distcc in simple compression mode. I've run tests to compare the same package, with the same computers, with different settings, and the results are really interesting if you're into statistics. :)

The distcc defaults are four jobs per-host, but I have four processors (five in this case--quite a bit more depending on what's plugged into the network) and they say you can double-up on your MAKEOPTS. Doubling up on your MAKEOPTS doesn't hurt things, but it's doing jack-squat if you're not sending those extra jobs to your helpers.
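
As a rough sanity check, you can add up how many jobs a hosts line will actually accept and compare that with your -j. A minimal sketch in plain sh, assuming the defaults from the man page quote above (four per host, two for localhost); the function name total_slots is made up for illustration and is not part of distcc:

```shell
# Sum the per-host job limits in a distcc hosts line.
# Assumes the documented defaults: 4 per host, 2 for localhost.
total_slots() {
    total=0
    for h in $1; do
        case "$h" in
            --*|+*) continue ;;          # global options such as --randomize
        esac
        spec=${h%%,*}                    # drop ,cpp / ,lzo options
        case "$spec" in
            */*) limit=${spec#*/}; limit=${limit%%:*} ;;   # explicit /LIMIT
            localhost) limit=2 ;;
            *) limit=4 ;;
        esac
        total=$((total + limit))
    done
    echo "$total"
}

total_slots "quad-core"      # prints 4
total_slots "quad-core/8"    # prints 8
```

If MAKEOPTS asks for more jobs than this total, the extra jobs most likely just wait for a free slot, which would match the "barely a trickle" behaviour described earlier.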

But if we were to add in a couple of more computers into the mix and want to update them all at once? Can you do that? YOU'RE DAMN RIGHT YOU CAN.

All of these processors process things at a different time and pace, so if tiny-core is still thinking about one job while it's at only 40%, while quad-core is sending out a job to it, tiny-core can handle it. Meanwhile our other computers are also processing and sending/receiving their jobs--what ends up happening is that no one computer is overloaded, because we've set our EMERGE_DEFAULT_OPTS="--jobs 2 --load-average 0.9" (or whatever's sane for that computer's processor). Setting your load average per computer is important too; that way, if it's working hard doing a job for one computer it won't start another job for itself, and this goes vice versa for all the other 'puters on the network too.

I really have no idea how all this works, but it's fascinating and I want to learn more!

I've seen it with my own eyes, by god. I sent my cluster off running a full --empty-tree, and for the most part it was a-ok. As a user who's tried to break it, I find Distcc to be extremely robust and flexible--having changed hosts, added/removed pump and/or compression on-the-fly without a hiccup. The program is great by its defaults, but once you get it tweaked, it's stunning.

But then, wondering if these computers all working together to bring each other up to snuff, and considering something about the tide and how it brings all the other boats up, or how the slowest computer in the chain holds everyone else back...doesn't it all just work out to be the same? I can't help but think this whole thing is nothing but mental masturbation, but hey, just like jerking off; it's fun, it feels good, and it hurts no one.

Happy computing, everyone!
_________________
I'm not even mad; I'm impressed!


Last edited by unheatedgarage on Sat Jul 14, 2018 1:03 am; edited 1 time in total

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 40979
Location: 56N 3W

Posted: Wed Jul 04, 2018 7:38 pm

unheatedgarage,

unheatedgarage wrote:
So I set my /etc/distcc/hosts to look like this:

tiny-core quad-core --randomize


The order in which hosts appear in /etc/distcc/hosts is important.
With hostA/8 hostB/8 ... hostZ/8, until hostA has 8 jobs running, nothing will go to hostB.
Until both A and B are busy, nothing will go to hostC, so localhost should not appear in /etc/distcc/hosts or if it does, it should be at the end.
If --randomize helps throughput, you are sending too many jobs to your helpers, as all it does is spread the load.
That might be a bit simplistic as the helpers will be doing their own thing too.
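
The ordering advice above can be checked mechanically. A small sketch in plain sh that moves any localhost entry to the end of a hosts line; localhost_last is a hypothetical helper name, not a distcc command, and hostA/hostB are the placeholder names from the example above:

```shell
# Reorder a distcc hosts line so that localhost entries come last.
localhost_last() {
    remote=""
    local_entries=""
    for h in $1; do
        case "$h" in
            localhost|localhost/*) local_entries="$local_entries $h" ;;
            *) remote="$remote $h" ;;
        esac
    done
    # Unquoted on purpose: word splitting collapses the extra spaces
    echo $remote $local_entries
}

localhost_last "localhost hostA/8 hostB/8"   # prints: hostA/8 hostB/8 localhost
```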

localhost still has to do the configuration, linking and possibly the preprocessing too.
With distcc, it also has to keep the network busy and, in pump mode, run compression, so distcc is not all gain.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

unheatedgarage
n00b

Joined: 19 Sep 2016
Posts: 34

Posted: Thu Jul 05, 2018 12:54 am

That's good to know. I'll keep playing with it and see what works best. Thanks, Neddy!
_________________
I'm not even mad; I'm impressed!

johngalt
Apprentice

Joined: 09 Sep 2004
Posts: 240
Location: 3rd Rock

Posted: Thu Jul 05, 2018 5:37 am

@unheatedgarage - Thanks for this post. My Core i7 965 EE has some work to do for my two pitiful laptops lol.

@Neddy - as always, thanks for your fixes and explanations.
_________________
desultory wrote:
If you want to retain credibility as a functional adult; when you are told that you are acting boorishly, the correct response is to consider that possibility and act accordingly to correct that behavior.


Amen.

R0b0t1
Apprentice

Joined: 05 Jun 2008
Posts: 255

Posted: Fri Jul 06, 2018 2:22 am

What is the average cores to GB of RAM ratio in your cluster? Is there a way to track machine utilization?

I have been wanting to set up a distcc farm on cheap ARM boards instead of buying a server.

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 40979
Location: 56N 3W

Posted: Fri Jul 06, 2018 4:29 pm

R0b0t1,

You need about 1GB/core for some C++ code, which is why it's not possible to build some things on small ARM boards.
On the other hand, that is very much the worst case.
With 96 cores working hard, I've not got over 60G RAM used, which suggests the average RAM/core is much lower.

Lots of small RAM ARM boards won't help you build the things that need 1G RAM per core.
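
To put numbers on that, here is a back-of-the-envelope sketch in sh. It assumes the roughly 1 GB per job worst case mentioned above; ram_limited_jobs is a made-up helper name and the board specs are hypothetical:

```shell
# Jobs a helper can take without swapping, assuming about 1 GB per job
# for heavy C++ (the worst case, not the average).
ram_limited_jobs() {
    cores=$1
    ram_gb=$2
    if [ "$ram_gb" -lt "$cores" ]; then
        echo "$ram_gb"
    else
        echo "$cores"
    fi
}

ram_limited_jobs 4 1      # 4-core board with 1 GB RAM -> prints 1
ram_limited_jobs 4 8      # 4 cores with 8 GB RAM      -> prints 4

# Average from the 96-core figure above: 60 GB over 96 cores, in MB per core
echo $((60 * 1024 / 96))  # prints 640
```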
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

unheatedgarage
n00b

Joined: 19 Sep 2016
Posts: 34

Posted: Sat Jul 14, 2018 12:59 am

Nevermind! Don't do that! This was a bad idea from the beginning, and I should feel bad for having thought of it--this is why we can't have nice things!

I write this from a computer that's herking-and-jerking from swapping too much. I set both my laptop and my helper to do some building and it just about crashed everything (thank goodness for large swap partitions!), and I have been experiencing these issues just about ever since I made the first post.

HOW EMBARRASSING

So it's back to the sane distcc defaults for me--no more overloading everything.

Hope I didn't crash anyone else's computers. :oops:

Happy distributing, everyone!
_________________
I'm not even mad; I'm impressed!

unheatedgarage
n00b

Joined: 19 Sep 2016
Posts: 34

Posted: Sat Jul 14, 2018 4:00 am

In this case, say my helper: tiny-core is set to

MAKEOPTS="-J100 -l1"
EMERGE_DEFAULT_OPTS=" --jobs 100 --load-average 0.9"

It was my hope that Portage would be smart enough to know to balance things based on the "-l" and "--load-average" settings in MAKEOPTS and EMERGE_DEFAULT_OPTS, respectively, but this doesn't seem to be so.

As we speak, tiny-core is stuck grinding into swap trying to emerge dev-libs/popt and sys-libs/glibc at the same time, so it seems likely that emerging more than one job at once (depending on the size and intensity of the jobs) will overload the processor, and Portage won't be able to keep the number of jobs set in MAKEOPTS to a sane level.
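
A likely explanation, offered as an assumption rather than a confirmed diagnosis: emerge --jobs controls how many packages build at once, and each of those builds is handed the full MAKEOPTS, so the two numbers multiply. The load limits only stop new jobs from starting once the load average has already risen, and the load average lags behind what is actually running. A rough sketch of the arithmetic in sh (worst_case_jobs is a made-up name):

```shell
# Worst case: every parallel emerge job starts its own make with the full -j
worst_case_jobs() {
    emerge_jobs=$1   # --jobs from EMERGE_DEFAULT_OPTS
    make_jobs=$2     # -j from MAKEOPTS
    echo $((emerge_jobs * make_jobs))
}

worst_case_jobs 100 100   # the settings above   -> prints 10000
worst_case_jobs 2 10      # --jobs 2 with -j10   -> prints 20
```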

When my desktop quad-core was seizing up earlier, it was receiving jobs from both tiny-core and laptop. Not sure which packages they were emerging at the time, but it was definitely enough to clog up quad's tubes and bring things to a near stand-still.

So, if I were to continue setting both tiny-core and laptop to build at the same time (with both sending jobs to quad-core), it certainly makes sense to keep distcc at its defaults.

But...

Maybe I could keep the not-so-sane settings I was talking about earlier in both tiny-core's and laptop's /etc/distcc/hosts, but make sure not to set EMERGE_DEFAULT_OPTS to more than --jobs 1. Maybe then that wouldn't completely overrun either quad-core or tiny-core/laptop.

Note: I'm no longer running localhost in either tiny-core or laptop, and they're just sending jobs to quad-core, not each other.

I like the idea of Portage beginning to process another package while the system is building another (as long as it's not overloaded), that's why I was all for setting more than one job in EMERGE_DEFAULT_OPTS, once I read about it, but if Portage is not meant (or not capable of) dealing with multiple jobs and distcc, then it's all for naught.

Well anyway, it's a blast--having a ball. I'm still on the path to get my computer science degree and become a Gentoo developer, it's just a long (expensive) path. Meanwhile, I'm standing on the shoulders of giants, and I know that you juggernauts are here to help me along.

It's funny how when your Windows system breaks, and you install Ubuntu as a patch (all the while proclaiming you hate computers and just want them to "work"), things flip. You spend your life wrenching on cars and motorcycles, then you go install one system--you do ONE THING, and the next thing you know your path and interests have completely changed.

And, well, here we are.

Thanks, everyone. Thanks everyone so much.
_________________
I'm not even mad; I'm impressed!

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 40979
Location: 56N 3W

Posted: Sat Jul 14, 2018 7:44 am

unheatedgarage,

The helper's make.conf is never consulted when it's running distcc for helping other systems.
Its distccd listens for incoming jobs and runs them, regardless of make.conf.

Code:
MAKEOPTS="-J100"
is incorrect - it's -j, not -J.
This sets the number of parallel jobs a single package can start. In this case, it's as good as unlimited.
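
What does limit a helper is distccd's own options. A hedged sketch of /etc/conf.d/distccd on the helper, assuming Gentoo's OpenRC layout and its DISTCCD_OPTS variable; --allow, --jobs and --nice are distccd options, but the subnet and the numbers below are placeholders to adapt:

```shell
# /etc/conf.d/distccd on the helper (values below are placeholders)
# --allow: which clients may connect
# --jobs:  cap on jobs accepted at once, whatever the clients ask for
# --nice:  run incoming compile jobs at lower priority
DISTCCD_OPTS="--allow 192.168.0.0/24 --jobs 8 --nice 15"
```

This is the server-side restriction the man page quote in the first post refers to with "may be further restricted by the server".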
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

unheatedgarage
n00b

Joined: 19 Sep 2016
Posts: 34

Posted: Sun Jul 15, 2018 5:09 am

Mr. Seagoon,

Thank you, sir! As always, I place your input on the highest pedestal (you have no idea how much you've helped me over the years).

Yessir, that MAKEOPTS was a typo, rest assured it was correct in the make.conf--thank you very much for your observations!

I've found that, using these absurd MAKEOPTS, glibc does not respect the load setting, at least not while it's installing. While it's building all seems well, but if I keep MAKEOPTS at -j100 -l1, the installation goes haywire--load spikes, swap starts, it's a mess. But if I set -j1 (like any sane person would do), tiny-core goes ahead and installs it (although it takes a while!).

What I ended up doing was setting a build environment for glibc called sane-makeopts and pointing glibc to it in package.env.
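
For anyone wanting to do the same, a sketch of that per-package override. The env file name sane-makeopts is the one mentioned above; the -j1 value and the staging directory are assumptions, and on a real system the files would live under /etc/portage:

```shell
# Staging directory so this can be tried without touching /etc/portage
PORTAGE_DIR="${PORTAGE_DIR:-$(mktemp -d)}"
mkdir -p "$PORTAGE_DIR/env"

# env/sane-makeopts: conservative build settings
cat > "$PORTAGE_DIR/env/sane-makeopts" <<'EOF'
MAKEOPTS="-j1"
EOF

# package.env: apply that env file to glibc only
cat >> "$PORTAGE_DIR/package.env" <<'EOF'
sys-libs/glibc sane-makeopts
EOF
```

Every other package keeps the global MAKEOPTS from make.conf; only glibc picks up the override.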

What I'm still curious about, though, is: how does Portage two-fist two packages, and how does it treat the "-l" setting in MAKEOPTS and the "--load-average" in EMERGE_DEFAULT_OPTS?

What's the magic with parallel builds, Distcc, and Portage?

None of this is urgent--just curious. Should I just get on IRC (it frightens and confuses me) and ask these questions there?

I'm grateful for your time!

Edited for spelling.
_________________
I'm not even mad; I'm impressed!

johngalt
Apprentice

Joined: 09 Sep 2004
Posts: 240
Location: 3rd Rock

Posted: Sun Jul 15, 2018 5:16 am

I asked a similar question in the Installing Gentoo forum regarding MAKEOPTS and EMERGE_DEFAULT_OPTS:

https://forums.gentoo.org/viewtopic-t-1083538.html

I've still not quite got my head wrapped around it, but I've resorted to MAKEOPTS="-j{#cores +1} -l{#cores}" for CPU utilization and EMERGE_DEFAULT_OPTS="--jobs=2 --load-average=2 --with-bdeps=y" on my Core i7 965 Extreme (Nehalem, aka 1st gen Core). So, in my case, it's -j9 -l8.
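
That rule of thumb is easy to script. A minimal sketch in sh, where makeopts_for is a made-up helper name and nproc (from coreutils) is assumed to be available to report the number of logical cores:

```shell
# Build the MAKEOPTS string from a core count: -j(cores+1) -l(cores)
makeopts_for() {
    cores=$1
    echo "MAKEOPTS=\"-j$((cores + 1)) -l${cores}\""
}

makeopts_for 8            # prints MAKEOPTS="-j9 -l8"
makeopts_for "$(nproc)"   # same rule for the machine this runs on
```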
_________________
desultory wrote:
If you want to retain credibility as a functional adult; when you are told that you are acting boorishly, the correct response is to consider that possibility and act accordingly to correct that behavior.


Amen.

unheatedgarage
n00b

Joined: 19 Sep 2016
Posts: 34

Posted: Mon Jul 16, 2018 6:39 am

Jeez. I really screwed up by posting this, then, because your thread was pretty much the same thing as mine. Hu always gives great, thorough, answers.

Thanks, Hu!

Thank you for your reply johngalt.

Just discovered this tonight: I've been running --quiet-build for so long, I forgot to watch the output of each package. You know how some packages won't build with distcc-pump enabled? I had created a build environment just for them called disable-pump. In that I had:

Code:
FEATURES="distcc -distcc-pump"


This worked, but what I didn't realize (after watching the output of portage) was that distcc was failing to distribute.

You see, I thought I could leave
Code:
,cpp,lzo
enabled in /etc/distcc/hosts and Distcc wouldn't care about whether or not I had enabled (or disabled) pump or compression in my build environment.

That's wrong, wrong, wrong.

Anytime I set disable-pump, that just broke it unless I removed
Code:
,cpp
from each of the helpers in the hosts file. Then Distcc started to work each system again.
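
In other words, the hosts line has to match the mode. A sketch of the edit in sh, assuming a helper entry like quad-core/8,cpp,lzo (the /8 limit is carried over from the first post; strip_pump is a made-up helper name):

```shell
# Drop the ,cpp option (pump mode) from every entry, keep ,lzo compression
strip_pump() {
    printf '%s\n' "$1" | sed 's/,cpp//g'
}

strip_pump "quad-core/8,cpp,lzo"   # prints quad-core/8,lzo
```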

Turns out I've been sucking my own dick this entire time, without even having the pleasure of knowing it.
_________________
I'm not even mad; I'm impressed!