Gentoo Forums
Why has portage become so slow
percy_vere_uk
Apprentice

Joined: 13 Dec 2008
Posts: 210
Location: Dorset UK

Posted: Mon Sep 30, 2013 1:17 pm    Post subject: Why has portage become so slow

Hi

I have noticed recently that 'emerge -pv --update --deep --newuse world' takes several times longer than it used to. It now takes around 5 minutes; a couple of months ago it took less than a minute.

Emerging a single package also takes several times longer than before. I have two Gentoo systems on the same laptop, and since I brought the second one up to date it has been behaving the same way.

Any ideas on the cause of this? One of the updates, perhaps?

percy


Last edited by percy_vere_uk on Wed Oct 02, 2013 12:00 pm; edited 2 times in total

yellowhat
Guru

Joined: 10 Sep 2008
Posts: 528

Posted: Mon Sep 30, 2013 8:02 pm

That's the same impression I get.
I am running: sys-apps/portage-2.2.7

PaulBredbury
Watchman

Joined: 14 Jul 2005
Posts: 7310

Posted: Tue Oct 01, 2013 1:31 am

Just idle speculation: try byte-compiling, e.g.:

Code:
cd /usr/lib/python2.7/site-packages && { python -m compileall . ; python -O -m compileall . ; }

dol-sen
Retired Dev

Joined: 30 Jun 2002
Posts: 2805
Location: Richmond, BC, Canada

Posted: Tue Oct 01, 2013 1:45 am

The reason is the new EAPI 5 subslot capabilities, plus a few other features, which mean that finding a working solution for the upgrade/merge path takes a lot more work. Because subslots allow automatic rebuilding of pkgs, Portage has to do many more reverse dep checks to ensure it doesn't break other installed pkgs, and it has to find solutions to upgrade or ignore other pkgs that may bring in slot conflicts.

Essentially, portage is doing more so you don't have to figure out conflicts, rebuild the stuff on your own.
_________________
Brian
Porthole, the Portage GUI frontend irc@freenode: #gentoo-guis, #porthole, Blog
layman, gentoolkit, CoreBuilder, esearch...

xaviermiller
Bodhisattva

Joined: 23 Jul 2004
Posts: 8706
Location: ~Brussels - Belgique

Posted: Tue Oct 01, 2013 6:12 am

Bug #468486
Bug #484788
Bug #431484

For me, that delay is unacceptable, I cannot use emerge as I did before. Too slow.
_________________
Kind regards,
Xavier Miller

Aiken
Apprentice

Joined: 22 Jan 2003
Posts: 239
Location: Toowoomba/Australia

Posted: Tue Oct 01, 2013 6:40 am

Had a look at those bug reports and saw someone with 11 minutes. My worst so far is 7 minutes, with 4-5 minutes common. With 12 Gentoo images to look after, that time adds up.

Core2duo 3.16GHz, 4G ram, wd 160G hd.

Code:

wilma ~ # echo 3 > /proc/sys/vm/drop_caches
wilma ~ # time emerge -pvuDN world

These are the packages that would be merged, in order:

Calculating dependencies... done!

Total: 0 packages, Size of downloads: 0 kB

real   4m16.130s
user   1m32.496s
sys   0m1.517s


vs Ubuntu on an old Sempron 1600, 512 MB RAM and 80G hd.

Code:

root@max:~# echo 3 > /proc/sys/vm/drop_caches
root@max:~# time apt-get upgrade
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages have been kept back:
  kscreen linux-generic linux-headers-generic linux-image-generic
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.

real   0m4.360s
user   0m1.760s
sys   0m0.108s
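
For a rough comparison of the two `time` outputs above, here is a minimal Python sketch. The helper names are made up for illustration and the figures are simply copied from the two runs. It works out what share of each run's wall-clock time was CPU time (user + sys); after dropping caches, the remainder is presumably mostly waiting on disk, though `time` alone cannot prove that.

```python
import re


def parse_time(value: str) -> float:
    """Convert a `time`-style duration such as '4m16.130s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", value)
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def cpu_share(real: str, user: str, sys: str) -> float:
    """Fraction of wall-clock time that was spent on the CPU (user + sys)."""
    return (parse_time(user) + parse_time(sys)) / parse_time(real)


# Figures copied from the emerge and apt-get runs above.
emerge_share = cpu_share("4m16.130s", "1m32.496s", "0m1.517s")
apt_share = cpu_share("0m4.360s", "0m1.760s", "0m0.108s")
wall_ratio = parse_time("4m16.130s") / parse_time("0m4.360s")

print(f"emerge:  {emerge_share:.0%} of wall-clock time on the CPU")
print(f"apt-get: {apt_share:.0%} of wall-clock time on the CPU")
print(f"emerge took about {wall_ratio:.0f}x as long in wall-clock terms")
```

Both runs spend well under half of their wall-clock time on the CPU, so the cold-cache numbers mix disk latency with actual computation.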

_________________
Beware the grue.

xaviermiller
Bodhisattva

Joined: 23 Jul 2004
Posts: 8706
Location: ~Brussels - Belgique

Posted: Tue Oct 01, 2013 6:43 am

And compared to Paludis?
_________________
Kind regards,
Xavier Miller

percy_vere_uk
Apprentice

Joined: 13 Dec 2008
Posts: 210
Location: Dorset UK

Posted: Tue Oct 01, 2013 12:17 pm

dol-sen has explained this slowdown: "essentially, portage is doing more so you don't have to figure out conflicts, rebuild the stuff on your own."

So it appears to be beyond my control. I have to say the delay is annoying, though!

percy

xaviermiller
Bodhisattva

Joined: 23 Jul 2004
Posts: 8706
Location: ~Brussels - Belgique

Posted: Tue Oct 01, 2013 12:24 pm

For me, it's not solved.
_________________
Kind regards,
Xavier Miller

mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Tue Oct 01, 2013 12:44 pm

A certain factor of speed can still be obtained by using squashfs for the portage tree and for /var/db/pkg (e.g. sys-fs/squashmount from the mv overlay, see here). On my system the speed gain is still considerable, but your mileage may vary. Roughly speaking, the more memory your machine has and the faster it is anyway, the smaller the gain will be.

xaviermiller
Bodhisattva

Joined: 23 Jul 2004
Posts: 8706
Location: ~Brussels - Belgique

Posted: Tue Oct 01, 2013 1:02 pm

I use squashfs, the lost time is not in i/o but in python code.
_________________
Kind regards,
Xavier Miller

mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Tue Oct 01, 2013 1:22 pm

XavierMiller wrote:
I use squashfs,

Also for /var/db
Quote:
the lost time is not in i/o but in python code

Well, dependency resolution is an NP-hard problem (just a guess, maybe it is even really exponential), which means that even for a modest number of packages the time may become unacceptable if you are close to a worst-case situation.
If you have a setup in which many backtracks are necessary, you are out of luck. It might help to put some "critical" dependencies in your world file to avoid unnecessary backtracking. It might also help to limit the backtracking depth.
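
To make the backtracking point concrete, here is a toy sketch in Python. It is not Portage's resolver: the package names, the `resolve` function and the `max_backtracks` parameter are all invented for illustration, loosely mirroring the idea behind emerge's `--backtrack=COUNT` option. It picks one version per package, undoes a choice whenever it runs into a conflict, and gives up once a backtrack budget is exhausted.

```python
class BacktrackLimit(Exception):
    """Raised when the search gives up after too many backtracks."""


def resolve(packages, conflicts, max_backtracks):
    """Pick one version per package so that no two picks conflict.

    packages:  dict of name -> list of versions, preferred version first
    conflicts: set of frozensets, each holding two (name, version) pairs
    Returns (assignment or None, number of backtracks performed).
    """
    names = list(packages)
    backtracks = 0

    def search(index, chosen):
        nonlocal backtracks
        if index == len(names):
            return dict(chosen)
        name = names[index]
        for version in packages[name]:
            candidate = (name, version)
            if any(frozenset((candidate, other)) in conflicts for other in chosen):
                continue  # clashes with something already picked
            chosen.append(candidate)
            result = search(index + 1, chosen)
            if result is not None:
                return result
            chosen.pop()  # undo the choice: this counts as one backtrack
            backtracks += 1
            if backtracks > max_backtracks:
                raise BacktrackLimit
        return None

    try:
        return search(0, []), backtracks
    except BacktrackLimit:
        return None, backtracks


# Hypothetical example: the newest "lib" cannot coexist with "tool".
packages = {"app": ["2", "1"], "lib": ["2", "1"], "tool": ["1"]}
conflicts = {frozenset({("lib", "2"), ("tool", "1")})}

print(resolve(packages, conflicts, max_backtracks=5))
print(resolve(packages, conflicts, max_backtracks=0))
```

With a generous budget the search recovers by falling back to the older "lib"; with a budget of zero it gives up at the first dead end. In the worst case the number of combinations to try grows exponentially with the number of packages, which is why a bad setup can hurt so much.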

xaviermiller
Bodhisattva

Joined: 23 Jul 2004
Posts: 8706
Location: ~Brussels - Belgique

Posted: Tue Oct 01, 2013 1:30 pm

I don't want to tune portage. It worked fine until last months, it needs to be profiled / fine tuned.
_________________
Kind regards,
Xavier Miller

percy_vere_uk
Apprentice

Joined: 13 Dec 2008
Posts: 210
Location: Dorset UK

Posted: Wed Oct 02, 2013 12:00 pm

XavierMiller said "For me, it's not solved."

OK, I have removed the solved. Let's hope it is solved at some point.

percy

mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Wed Oct 02, 2013 2:43 pm

XavierMiller wrote:
It worked fine until last months, it needs to be profiled / fine tuned.

You cannot reasonably fine-tune an NP-complete/exponential-worst-case algorithm. If you are in a bad case, you have to change the case.

xaviermiller
Bodhisattva

Joined: 23 Jul 2004
Posts: 8706
Location: ~Brussels - Belgique

Posted: Wed Oct 02, 2013 6:22 pm

I am a user, not a NP-Case 8O

My world is simpler this year than in years past. I see that Portage has become too bloated and needs some fitness work to be usable again. It's a "normal" milestone, as in other projects: some optimizations need to be done after too many changes in all directions.

If portage cannot be optimized, I will try other package management systems... and so would be on the slope to exit.
_________________
Kind regards,
Xavier Miller

GFCCAE6xF
Apprentice

Joined: 06 Aug 2012
Posts: 295

Posted: Wed Oct 02, 2013 6:49 pm

Is there anything the seemingly very few people affected by this have in common?

Personally, I haven't noticed any slow-downs during the past year or so and doing Aiken's basic benchmark/test myself I get a time of 36 seconds and 11 seconds on following tries without the drop cache play.

xaviermiller
Bodhisattva

Joined: 23 Jul 2004
Posts: 8706
Location: ~Brussels - Belgique

Posted: Wed Oct 02, 2013 6:53 pm

I have 2 machines with almost the same world. The ~amd64 one is OK and not slow, but the ~x86 one is really slow. The difference in CPU frequencies alone does not account for the ratio.
_________________
Kind regards,
Xavier Miller

TomWij
Retired Dev

Joined: 04 Jul 2012
Posts: 1553

Posted: Wed Oct 02, 2013 7:54 pm

To get an idea what's going on, here is a slightly more verbose overview:

0:00: Calculating dependencies
0:07: Adding root packages
0:17: Processing dependencies
1:06: Adding root packages
1:08: Processing dependencies
1:43: Adding root packages
1:46: Processing dependencies
2:45: Checking for slot conflicts
2:45: Trying to accept blocker conflicts
2:45: Resolving slot conflicts for complete graph
2:45: Processing slot conflicts
2:45: Triggering slot operator reinstalls
3:18: Validating blockers
3:21: Checking for blocker conflicts
3:21: Checking for rebuild triggers
3:21: Checking if restart is needed
3:21: Checking if we have to prune rebuilds
3:21: Checking if restart is needed
3:21: Checking for parameters that change behavior
3:21: Checking for changes that are needed
3:21: Done resolving! ... done

(This output can be obtained with something like https://gist.github.com/TomWij/d25d1c8f4cf2b2122b0d)

As you can see, 2 minutes and 45 seconds is spent on building the Portage tree and 33 seconds on the slot conflicts.
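
The two figures can be checked against the timeline with a short Python sketch. The parsing below is only an illustration and assumes each phase lasts until the next line is printed; the timestamps are copied verbatim from the output above.

```python
from collections import Counter

# Timeline copied from the output above (minutes:seconds since start).
TIMELINE = """\
0:00: Calculating dependencies
0:07: Adding root packages
0:17: Processing dependencies
1:06: Adding root packages
1:08: Processing dependencies
1:43: Adding root packages
1:46: Processing dependencies
2:45: Checking for slot conflicts
2:45: Trying to accept blocker conflicts
2:45: Resolving slot conflicts for complete graph
2:45: Processing slot conflicts
2:45: Triggering slot operator reinstalls
3:18: Validating blockers
3:21: Checking for blocker conflicts
3:21: Checking for rebuild triggers
3:21: Checking if restart is needed
3:21: Checking if we have to prune rebuilds
3:21: Checking if restart is needed
3:21: Checking for parameters that change behavior
3:21: Checking for changes that are needed
3:21: Done resolving! ... done"""


def to_seconds(stamp: str) -> int:
    """Convert 'M:SS' to a number of seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


entries = []
for line in TIMELINE.splitlines():
    stamp, label = line.split(": ", 1)
    entries.append((to_seconds(stamp), label))

# Assumption: a phase lasts until the next line appears; the last line has no duration.
totals = Counter()
for (start, label), (end, _next_label) in zip(entries, entries[1:]):
    totals[label] += end - start

graph_seconds = sum(
    totals[label]
    for label in ("Calculating dependencies", "Adding root packages", "Processing dependencies")
)

for label, seconds in totals.most_common():
    if seconds:
        print(f"{seconds:4d}s  {label}")
print(f"graph building: {graph_seconds}s, slot operator reinstalls: "
      f"{totals['Triggering slot operator reinstalls']}s")
```

Summed this way, "Processing dependencies" accounts for most of the first 2:45, spread over three passes.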

There are options (--ignore-built-slot-operator-deps=y and --rebuild-if-new-slot=n) that can be played with to get rid of that 33 seconds; but if you do so, you are moving that time to other places where you might possibly have to spend a lot more time on it. I don't think it is really worth disabling it...

As for the 2 minutes and 45 seconds, that depends a lot on your Portage tree; while one way would be dealing with the code (which is a hard thing to do) you can also do it the other way:

  • Make sure you only have the overlays you actually need; if possible, just pull ebuilds from overlays to your own overlay. Having a lot of overlays can slow things down a lot.
  • Make sure you clean out /var/lib/portage/world once in a while to get rid of packages you don't actually need.
  • Whenever you merge something think about why the dependencies are pulled in (--tree --unordered-display), do you actually need them? Disable the USE flags if you don't.
  • Find packages that don't have a lot of dependencies for what they need to do, light-weight software can have its benefits.


All these will make the dependency tree smaller, which means it would be done faster.

Some ideas for dependency tree resolution involve even more caching (eg. for reducing the USE flags), making it parallel (but Python global interpreter lock sits in the way) or just plain rewriting it with an alternative backtracking algorithm.

Talking about backtracking, you might want to lower the value by default such that it doesn't consider too much; then when you get problems you can increase it temporarily and run again.

And for those interested, here is a call tree (http://i.imgur.com/A93CdNR.png) from a while ago and another call tree (http://i.imgur.com/UbxEUB2.png) that I captured last week. In the second one you can see it run again (2x) to resolve some blockers or conflicts. There are multiple ways to interpret what's heavy; since I can't blindly trust any of them, I'm looking at it top-down (making it a bit more verbose with each try), but I haven't found anything noteworthy to fix yet. I'm mostly looking for things that are done multiple times but could be done just once (e.g. a list that's not made unique and therefore has duplicate entries; processing the same entry twice is unnecessary).
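
As an illustration of the duplicate-entry case, here is a minimal Python sketch. The package atoms and the `unique_in_order` helper are hypothetical and not taken from Portage's code; the point is only that removing duplicates up front, while keeping the original order, avoids processing the same entry twice.

```python
def unique_in_order(items):
    """Drop duplicates while keeping the first occurrence of each item in place."""
    # dict preserves insertion order, so its keys are the unique items in order.
    return list(dict.fromkeys(items))


# Hypothetical work list in which two atoms were queued twice.
pending = [
    "dev-libs/openssl",
    "sys-libs/zlib",
    "dev-libs/openssl",
    "dev-lang/python",
    "sys-libs/zlib",
]

work = unique_in_order(pending)
print(f"{len(pending)} queued entries, {len(work)} actually need processing")
for atom in work:
    print("processing", atom)
```

Whether this is safe in a given spot depends on whether later code relies on the duplicates, so it needs checking case by case.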

xaviermiller
Bodhisattva

Joined: 23 Jul 2004
Posts: 8706
Location: ~Brussels - Belgique

Posted: Wed Oct 02, 2013 8:14 pm

TL;DR ;) As I stated, I see myself in this case as a user, not a developer. I don't want to debug Python; I want a working and fast PMS.

Can't Portage be KISSer ?
_________________
Kind regards,
Xavier Miller

xaviermiller
Bodhisattva

Joined: 23 Jul 2004
Posts: 8706
Location: ~Brussels - Belgique

Posted: Wed Oct 02, 2013 8:41 pm

OK, I will elaborate: I follow the gentoo-dev mailing list, and I see a lot of people having nice ideas and proposing cool new features or technical solutions for automatically resolving special cases of dependencies and conflicts.

Indeed, there are really cool features, and Portage has become really clever in recent months.

But we are at the point where the software has more intelligence than the user, and it costs a lot of processing to run those cool algorithms.

For me, this is too much, or too much for an interpreted language.

As a developer, I always prefer KISS solutions over "user friendly" paths that sometimes help the user but don't help him/her to learn, and that hide everything behind complicated machinery.

So, what is the cost of the resulting system? In some cases, maybe nothing. But in other cases, it's too CPU expensive. And nobody can help me to help developers to point where the problem is (except theoretical stuff about algorithm complexity or non-default fine-tune parameters to check).

I have only 1 overlay (pro-audio), and a lot of USE flags because I don't want to be bloated by *kit/systemd/non-KISS/anti-UNIX pushed dependencies.

After 10 years of a simple and efficient portage, I am lost in a bloated PMS, with a lot of cryptic concepts that are not well explained to existing users (I don't reread the User Guide each month to see the differences). All I want to do is sync / update / install stuff, not install 3rd-party software to compute dependencies more efficiently; eix is my favorite tool, but it is proof that portage is bloated. revdep-rebuild was useful; I could understand needing to use it again.

It's a pity that the idea of paludis was not integrated/studied but generated a fork.

Please, try to cut down portage and optimize it. This would be a benefit for everybody.

KISS. No more.
_________________
Kind regards,
Xavier Miller

TomWij
Retired Dev

Joined: 04 Jul 2012
Posts: 1553

Posted: Wed Oct 02, 2013 11:36 pm

It is not so much about Portage alone becoming more complex; that half minute added to three minutes isn't really that much, and it saves much more time than that half minute.

It's rather about the size and complexity of the Portage tree growing (http://i.imgur.com/mA6rMnY.png from http://tinyurl.com/gegraphs); as time goes by new packages come in, existing packages use new dependencies, programs grow more features (= often more dependencies covered under USE flags), more blockers come to life, more alternatives appear in virtuals, and the list goes on...

Back in the middle ages, post could be handled by one or two persons sitting at the post office; nowadays, our cities and connections are so crowded that automatic sorting machines and what not are necessary to keep up. That being said, Portage does a very good job at dealing with the complexity it is provided with. If there's one place where a gain could be found, it is in the complexity that we are building, and not necessarily in the package manager.

Although if there is a way to deal with this in the package manager instead of the Portage tree, then yes, it would be handy to be able to do it there. But there comes a point where you simply cannot optimize in big steps anymore...

(Word aside; I am not a Portage developer, despite a single commit)

Quote:
And nobody can help me to help developers to point where the problem is (except theoretical stuff about algorithm complexity or non-default fine-tune parameters to check).


Well, algorithm complexity is what it is going to be all about, because it defines the limits of how much you can improve. There are three major ways to increase performance: make a single action take less time (e.g. caching), use parallelism (as it cuts wall-clock time), or find a better-suited algorithm that is less complex in terms of time and/or space.
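
To illustrate the caching option, here is a minimal Python sketch using the standard library's `functools.lru_cache`. The dependency table and the `closure` function are invented for the example and say nothing about how Portage stores or walks dependencies; the sketch only shows a repeated sub-computation being answered from a cache instead of being recomputed.

```python
from functools import lru_cache

# Hypothetical dependency table: two libraries share the same base library.
DEPENDS = {
    "app": ("libfoo", "libbar"),
    "libfoo": ("libbase",),
    "libbar": ("libbase",),
    "libbase": (),
}

calls = 0  # counts how often the function body really runs


@lru_cache(maxsize=None)
def closure(pkg: str) -> frozenset:
    """Return every package that `pkg` depends on, directly or indirectly."""
    global calls
    calls += 1
    result = set(DEPENDS[pkg])
    for dep in DEPENDS[pkg]:
        result |= closure(dep)
    return frozenset(result)


print(sorted(closure("app")))
print(f"function body executed {calls} times")
print(closure.cache_info())
```

Here "libbase" is reached through both "libfoo" and "libbar" but is only computed once. The hard part in real code is, as noted, working out which results are safe to cache and when a cached result has to be invalidated.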

Both caching (there is quite some of this already; it is somewhat harder to find out what still needs a cache and how to bring that cache into the code) and parallelism (Python's global interpreter lock doesn't allow it, but perhaps PyPy could be a long-term solution) are two options to really consider; the other option is to simply rewrite the code using some other algorithm (but that's a matter of getting back to the academic drawing board). Whichever way you pick, there is a lot of work involved with not many people interested (*yet*; as it is bound to get worse, more people will eventually jump in to solve this growing problem); it is work that improves running time, but might put Portage behind in other ways.

The other option is to continue on one of the forks out there [eg. pkgcore] (or this extremely early / small fork that I started); but given the amount of work that involves and that it doesn't guarantee that everyone will end up using it, I don't see it as something I should do until it really solves an actual practical problem. I'm not too bothered of my entire world dependency tree resolution taking four minutes; with the amount of packages I have, it doesn't really seem too much...


Last edited by TomWij on Thu Oct 03, 2013 7:48 pm; edited 1 time in total

TomWij
Retired Dev

Joined: 04 Jul 2012
Posts: 1553

Posted: Wed Oct 02, 2013 11:44 pm

Aiken wrote:
wilma ~ # time emerge -pvuDN world
root@max:~# time apt-get upgrade


Please note that when you do that you are comparing apples and eggs: resolving a dependency tree of binaries (that don't need rebuilds) versus resolving a dependency tree of source-based, USE-flag-controlled, keyworded/unmasked ebuilds and so on. You are simply looking at different complexities by design.

mv wrote:
A certain factor of speed can still be obtained by using squashfs for the portage tree and for /var/db/pkg (e.g. sys-fs/squashmount from the mv overlay, see here). On my system the speed gain is still considerable, but your mileage may vary. Roughly speaking, the more memory your machine has and the faster it is anyway, the smaller the gain will be.


This might even get better:

http://thread.gmane.org/gmane.linux.kernel/1562414
http://thread.gmane.org/gmane.linux.kernel/1567802

modnaruved
Apprentice

Joined: 21 Mar 2011
Posts: 158

Posted: Thu Oct 03, 2013 12:05 am

I did an update today and did not see any delays.


Just dreaming: maybe use Erlang for the package manager? It would mean more dependencies, and it lacks a wide package ecosystem and libraries of its own, but it could use all CPU resources for speed... Also, a git DVCS would help sync faster.


Last edited by modnaruved on Thu Oct 03, 2013 2:59 pm; edited 1 time in total
Back to top
View user's profile Send private message
Aiken
Apprentice
Apprentice


Joined: 22 Jan 2003
Posts: 239
Location: Toowoomba/Australia

PostPosted: Thu Oct 03, 2013 12:10 am    Post subject: Reply with quote

TomWij wrote:
I'm not too bothered of my entire world dependency tree resolution taking four minutes; with the amount of packages I have, it doesn't really seem too much...


It adds up. By the time I get to the 12th machine the short time of apt-get upgrade to see if anything needs updating gets quite tempting.
_________________
Beware the grue.

All times are GMT
Page 1 of 7