Long-lasting Portage slowness issue: is it really I/O's fault? No!

rvalles · Tux's lil' helper Joined: 19 Feb 2003 Posts: 121

We've discussed Portage's slowness many times, and in the end we always hear the same things: that it is because of the large number of small files it has to read, that filesystems are bad at this sort of I/O, etc.

Well, after many runs of "time emerge -Dupvt world" the results "stabilized" (a few ms more, a few ms less; the point is that the HD isn't touched anymore) at these figures:

spb · Posted: Sat Sep 10, 2005 4:05 pm

The problem is mainly that current Portage's dependency resolution code is, in a word, stupid.

marduk · Retired Dev Joined: 20 Sep 2002 Posts: 78

Agreed, a lot of it can be attributed to the way that Portage is written. For example, I still can't figure out why there are so many deep copies (which are expensive). That is one of the main reasons why "import portage" takes so long. I've had to rip parts out of portage.py and put them in my own code for packages.gentoo.org, simply because importing the portage module takes way too long to be practical in a CGI script.
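As a rough illustration of why deep copies are expensive, here is a minimal sketch. The `config` dictionary is a made-up stand-in for a nested settings structure, not anything taken from portage.py, and the timings will vary by machine.

```python
import copy
import timeit

# Hypothetical nested structure, loosely imitating a settings dictionary.
config = {
    "USE": ["X", "gtk", "ssl"] * 50,
    "env": {str(i): [i, str(i)] for i in range(200)},
}

# A shallow copy creates a new outer dict but shares the nested objects.
shallow = copy.copy(config)
shallow_is_shared = shallow["USE"] is config["USE"]

# A deep copy recursively duplicates every nested object.
deep = copy.deepcopy(config)
deep_is_independent = deep["USE"] is not config["USE"] and deep == config

# Compare the cost of the two approaches over repeated calls.
shallow_time = timeit.timeit(lambda: copy.copy(config), number=200)
deep_time = timeit.timeit(lambda: copy.deepcopy(config), number=200)
print(f"shallow: {shallow_time:.4f}s  deep: {deep_time:.4f}s")
```

A deep copy is only needed when the caller is going to modify the nested objects; for read-only use a shallow copy or no copy at all is usually enough.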

Also, a lot of the code is just "old code" that hasn't been optimized. Consider the function grabfile() in portage_util.py. On my machine, when I "import portage", grabfile() is called 1276 times (before I even call a function). The total time is 0.310 seconds. This isn't a lot by itself, but it all adds up. If you look at grabfile(), it's using the old string module, which has long been deprecated. Strings are first-class objects in Python now. Also, file objects are iterators, so reading all the lines of a file into memory and then iterating over those lines is no longer necessary.
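To make the point about string methods and file iteration concrete, here is a minimal sketch. It is not the real grabfile() from portage_util.py, nor the modified version mentioned in this thread; the names `grabfile_eager` and `grabfile_lazy`, and the assumption that the function collapses whitespace and skips blank and "#" comment lines, are purely illustrative.

```python
def grabfile_eager(path):
    """Older style: load every line into a list first, then loop over it."""
    try:
        with open(path) as f:
            lines = f.readlines()  # whole file held in memory at once
    except OSError:
        return []
    result = []
    for line in lines:
        line = " ".join(line.split())  # collapse runs of whitespace
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        result.append(line)
    return result


def grabfile_lazy(path):
    """Same result, but iterates over the file object line by line."""
    try:
        with open(path) as f:
            # The file object yields one line at a time; no readlines() needed.
            normalized = (" ".join(line.split()) for line in f)
            return [ln for ln in normalized if ln and not ln.startswith("#")]
    except OSError:
        return []
```

Both functions return the same list for the same file; the second one simply avoids the intermediate list of raw lines and uses string methods instead of the string module.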

I took that simple function, grabfile() and made a few changes to it:

Shadow Skill · Veteran Joined: 04 Dec 2004 Posts: 1023

I think that splitting the tree would also be very effective in terms of speeding things up for users, and could possibly even eliminate the need for the package. files. If the tree is split by Core Library > Window Manager > CVS derivatives therein, you have approximately four to six trees, not counting different architectures. [Assuming you need a misc category for certain applications.] If people want CVS applications they would just set the CVS tree as active: no more stupid "This package depends on foo but foo is masked by some really stupid config file." messages. The only files one would still need at all would be /etc/portage/package.mask and package.use. [There really needs to be a mask flag that only masks a package for the duration of the emerge operation; having to edit package.mask directly all the time is just dumb.]

Jeremy_Z · l33t Joined: 05 Apr 2004 Posts: 671 Location: Shanghai

So when is the Gentoo Summer of Code: Rewrite Portage?
