

asturm wrote: TMPDIR is for the actual build; it can in no way change anything about dependency calculation... which never exceeds a minute on my systems, so it would really be interesting to know what is causing that slowdown for some users.


axl wrote: I read a discussion on IRC. I wasn't present, and didn't participate, but those people were suggesting: ahhh.. tmpfs is not that fast. it's actually bad to use it. what if you run out of memory. what do you do then?!

The kernel will start swapping to free memory.
axl wrote: First of all, I favor ramfs over tmpfs, which is tmpfs minus all that ACL stuff, and second of all, it speeds up a lot of stuff.

Could you explain why ramfs is so much better than tmpfs? Most people won't set ACLs in a tmpfs, and if the ACL is not set, the cost of supporting it should be very small.
axl wrote: and finally, move /var/db/pkg into RAM.

I would recommend that this be done only on a system which also stores /, /usr, /etc, /var, etc. in RAM. Since /var/db/pkg records what is installed, you will have a significant mess if you update packages, then lose the associated changes to /var/db/pkg to a shutdown/reboot.
axl wrote: i don't think portage starts with a list of cached things it already has installed. it would use too much memory.

/var/db/pkg is the record of what is installed. It is not a particularly efficient representation, but it is easy to manage. Could you explain why you think there would be a substantial memory burden to keeping such a cache? What do you think would be in the cache that is not already persisted today?
axl wrote: yes, also those people on irc believe that kernel cache will handle everything by itself.

The kernel is, in general, quite good at handling caching. If a user program represents data in a way that requires substantial processing to convert from the persisted form to a useful form, kernel caching won't help with that, but neither will forcing the persisted form to be in RAM.
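For anyone who wants to experiment with a RAM-backed build directory anyway, here is a minimal /etc/fstab sketch. The 16G size cap, ownership and mode are assumptions to adapt to your own RAM budget; a ramfs line would look similar, but ramfs honours no size cap and its pages cannot be swapped out:

```
# hypothetical tmpfs build directory, capped at 16G (adjust to taste)
tmpfs   /var/tmp/portage   tmpfs   size=16G,uid=portage,gid=portage,mode=775,noatime   0 0
```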


The script also has a (trivial) ebuild in the sakaki-tools overlay (here), but you can also just directly download and (after reading, for hygiene!) run it, if you wish.

The emtee process runs as follows:
1. Derive a full, versioned build list for the @world set and its entire deep dependency tree, via emerge --with-bdeps=y --pretend --emptytree --verbose [opts] @world, which Portage can do relatively quickly. The resulting list, as it is derived as if no packages were installed to begin with, will automatically contain all necessary packages at their 'best' versions (which may entail upgrades, downgrades, new slots etc. wrt the currently installed set).
2. Filter this list, by marking each fully-qualified atom (FQA=$CATEGORY/$PF) within it for building (or not). Begin with all FQAs unmarked.
* Then (pass 1), mark anything which isn't a block, uninstall or reinstall for build;
* Then (pass 2), check each reinstall, to see if its active USE flag set is changing (default behaviour), or if any of its USE flags are changing (-N/--newuse behaviour), and if so, mark that package for build (fortunately, the --verbose output from step 1 contains the necessary USE flag delta information to allow us to easily work this out).
* Then (pass 3), if -S/--force-slot-rebuilds is in use, for each marked package on the list whose slot or subslot is changing (also inferable from the phase 1 output), search /var/db/pkg/FQA/RDEPENDS (and DEPENDS, if --with-bdeps=y, the default, is active) for any matching slot dependencies. Mark each such located (reverse) dependency that is also on the original --emptytree list (and not a block or uninstall) for build.
Note that pass 3 is skipped by default, since the phase 4 emerge (aka the real emerge) will automatically trigger any necessary slot rebuilds anyway, so it is redundant except for in a few esoteric situations.
3. Iff -c/--crosscheck (or -C/--strict-crosscheck) is passed, compare the FQA build list produced by invoking emerge --with-bdeps=y --pretend --deep --update [--changed-use|--newuse] [opts] @world (adapted appropriately for the specified options), with that produced by invoking emerge --oneshot --pretend [opts] on the filtered FQA build list from phase 2. If any differences are found, report them (and, additionally, stop the build in such a case, if -C/--strict-crosscheck was specified). Also report a series of comparative (total elapsed wall-clock) timings for both alternatives, for benchmarking purposes.
Note: crosschecking should only be used for reassurance or benchmarking, as it will, of necessity, be slower than the baseline in total time cost (since the check involves running both that and the newer, --emptytree-based approach)! So, if your goal is to improve emerge times, do not pass -c/-C.
4. Invoke the real emerge, as: emerge --oneshot [opts] filtered-FQA-build-list-from-phase-2.
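The pass-1/pass-2 filtering in step 2 can be sketched in a few lines of Python. This is a toy illustration, not emtee itself: the one-line-per-package format and the sample atoms below are simplified assumptions (real emerge --pretend --verbose output carries more fields), and only emerge's "%" (new USE flag) and "*" (changed USE flag) markers are honoured:

```python
import re

# Toy sketch of emtee's pass-1/pass-2 filtering. Assumptions: simplified
# emerge --pretend --emptytree --verbose lines; a trailing "%" marks a new
# USE flag and "*" a changed one.
SAMPLE = [
    '[ebuild  N    ] dev-libs/foo-1.2.3  USE="ssl -static"',
    '[ebuild  U    ] sys-apps/qux-3.1  USE="nls"',
    '[ebuild  R    ] sys-apps/bar-2.0  USE="X* -doc"',
    '[ebuild  R    ] app-misc/baz-1.0  USE="unicode"',
    '[blocks  B    ] <dev-libs/old-1.0',
    '[uninstall    ] dev-libs/old-0.9',
]

LINE = re.compile(
    r'^\[(ebuild|blocks|uninstall)\s*([A-Za-z]*)\s*\]\s+(\S+)'
    r'(?:\s+USE="([^"]*)")?'
)

def build_list(lines):
    """Return the FQAs that would be marked for building."""
    marked = []
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        kind, letters, fqa, use = m.groups()
        if kind in ('blocks', 'uninstall'):
            continue                      # pass 1: never built
        if 'R' not in letters:
            marked.append(fqa)            # pass 1: new, upgrade, downgrade...
            continue
        # pass 2: a reinstall is built only if some USE flag is changing
        if any(flag.endswith(('*', '%')) for flag in (use or '').split()):
            marked.append(fqa)
    return marked

print(build_list(SAMPLE))
# -> ['dev-libs/foo-1.2.3', 'sys-apps/qux-3.1', 'sys-apps/bar-2.0']
```

Note that app-misc/baz drops out: it is a reinstall with no USE delta, which is exactly the class of no-op work the --emptytree filtering approach avoids handing to the real emerge.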
asturm wrote: git doesn't influence dep resolution. Instead, if you picked the wrong mirror without metadata cache, it would explain why you see such long times for dep resolution.

Being a rebel, I picked without metadata cache ON PURPOSE and just generate it myself. It's also pretty quick for me because I blacklisted many parts of ::gentoo I don't use; say, dev-haskell/* is missing for me.
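One way to blacklist whole tree sections like this, assuming an rsync-based sync (the exclude-file path below is a hypothetical choice; PORTAGE_RSYNC_EXTRA_OPTS is a standard make.conf variable), is an rsync exclude file:

```
# /etc/portage/rsync_excludes  (hypothetical location)
dev-haskell/**

# /etc/portage/make.conf
PORTAGE_RSYNC_EXTRA_OPTS="--exclude-from=/etc/portage/rsync_excludes"
```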

axl wrote (quoting Hu's reply above in full): its testing.
axl wrote: its testing. it's just testing. you would assume that if you read every file from /var/db/pkg into the kernel so as to cache it, it would not ever be required to redo that operation.

You might assume that, but I wouldn't. I'm not aware of any specification that requires the kernel to behave in any particular way with regard to retaining data read from files in an in-memory cache, and absent specific documented promises from the relevant developers, I would assume that the caching policy could change in a subsequent release if the maintainers decide they have a better idea about how to handle it.
axl wrote: let's say I have a gentoo portage rsync server. ~ what is it? under a gig of data files. small potatoes. right? let's say I have a user X doing find /var/db/repos/gentoo (or /usr/portage) -exec dd if='{}' of=/dev/null ';'. Now we are getting into Spectre territory, at least how I perceive it.

Spectre should not apply here. We are not engaged in speculative execution, merely speculative caching.
axl wrote: What do you expect from your kernel and cache management?

I expect the kernel to manage the available cache as best it can within system policy limits, and that "best" is decided based on algorithms maintained by people who spend considerable effort optimizing this.
axl wrote: A user putting something in ram, maybe, is not the best thing to keep in ram, for another user. now if root puts something specifically in ram... now.. that's another story.

I'm not aware of user permissions affecting the kernel's decision of what to speculatively cache. As for root affirmatively pre-caching something with tmpfs, the kernel is free to page out contents from tmpfs if it needs memory elsewhere. Based on how it speculatively pages out content to improve other file caching, it may even swap out a tmpfs to make room to cache files read by a user.
Code: Select all
[root@sanziana:/etc/portage/package.env]# free
               total        used        free      shared  buff/cache   available
Mem:       131654016    54979804    55631804       21744    21042408    62485288
Swap:      268435448    85864960   182570488
Sun Jan 12 - 03:44:03