Gentoo Forums
Why isn't Portage multithreaded?
dE_logics
Advocate
Advocate


Joined: 02 Jan 2009
Posts: 2253
Location: $TERM

PostPosted: Thu Mar 03, 2011 1:10 am    Post subject: Reply with quote

This discussion has grown big.


I gave another thought to this the next day when I was emerging VirtualBox.

After an emerge -pv, most of the data Portage works on is cached in RAM. So in this case (i.e. a second or third emerge with -pv), multithreading is bound to give an advantage.

+ Multithreading is possible. You can calculate the dependencies of each package in separate threads; that won't change the outcome.
_________________
My blog
Genone
Retired Dev
Retired Dev


Joined: 14 Mar 2003
Posts: 9532
Location: beyond the rim

PostPosted: Thu Mar 03, 2011 5:31 pm    Post subject: Reply with quote

Multithreading won't give much of a benefit anyway, because dependency calculation can't be split up as much as you think: decisions in one branch may influence decisions in other branches (e.g. any-of deps). Getting that right concurrently would add enough overhead (and memory usage) to eat up any improvement, and would make debugging and maintenance much harder in the long run (debugging multithreaded apps is always a pain).
Also, I'm not sure if this still applies, but Python used to have issues with multithreaded apps due to the global interpreter lock. In fact, Guido recommended using multiprocessing instead (which wouldn't be a good idea for dep resolution, given the process startup times and the rather short-lived processes).
And as has been mentioned, I/O is always an issue with Portage, as it has to access (not necessarily read) up to many thousands of files for a system-wide dep resolution.

While Portage has had a lot of performance improvements in recent years, a good part of that is eaten up again by the additional complexity in the dep resolver needed for new requirements like use-/slot-deps, improved package selection and EAPI support in general.
Yamakuzure
Advocate
Advocate


Joined: 21 Jun 2006
Posts: 2284
Location: Adendorf, Germany

PostPosted: Fri Mar 04, 2011 11:45 am    Post subject: Reply with quote

sparc wrote:
Confused? Portage is a complex tool with too many options. The thing is that these options do not behave the same when combined together. See if I enter --emptytree above I get even shorter times (around 20 something seconds). The time changes if I enter --newuse also. So what is the problem, you might ask.

The problem is the following:
Code:

# time emerge --pretend --update --deep --with-bdeps=y --update @world

These are the packages that would be merged, in order:

Calculating dependencies... done!
...
(a slightly longer package list)
...

real    3m31.638s
user    3m28.788s
sys     0m2.550s


What changed? A simple option called --deep, in combination with everything else. Whatever the case on YOUR system, I'm not going to continue following this thread. That is because in my book the 30 seconds that satisfy you all is still slow performance. And as a last remark, if you all exit your comfort zone and start playing around with portage's options, you too can make it run for 3-5 minutes.
Don't you ever read? Oh man, this is absolutely ridiculous! I used all these options, just see above! And I did not "make it to 3-5 minutes".

With this attitude you get yourself on every ignore list quicker than it took to write this.

And again: Forget flat files, try sqlite. "man portage" -> "/ sqlite" -> read it, install "dev-python/pysqlite" -> create /etc/portage/modules like stated in the man page -> run "emerge --metadata" -> be happy.
_________________
Important German:
  1. "Aha" - German reaction to pretend that you are really interested while giving no f*ck.
  2. "Tja" - German reaction to the apocalypse, nuclear war, an alien invasion or no bread in the house.
Suicidal
l33t
l33t


Joined: 30 Jul 2003
Posts: 959
Location: /dev/null

PostPosted: Fri Mar 04, 2011 1:05 pm    Post subject: Reply with quote

I got up to 1m6s with this ridiculous command.

utterly redundant emerge command:
emerge -uDDDepvN --with-bdeps=y --tree --color=y world

The problem with most PhD's is that they already know everything.
Ant P.
Watchman
Watchman


Joined: 18 Apr 2009
Posts: 6920

PostPosted: Fri Mar 04, 2011 10:42 pm    Post subject: Reply with quote

Suicidal wrote:
I got up to 1m6s with this ridiculous command.

Ha, I got 0:55.92 on a cold cache and 0:12.57 on a re-run. :lol:

(which is already lightning-fast compared to the glacial time it takes for paludis: 3:05.09/1:30.81 for `cave resolve --everything installed-slots`, and that's *with* the ridiculous effort it takes to set up metadata caches in it!)
Suicidal
l33t
l33t


Joined: 30 Jul 2003
Posts: 959
Location: /dev/null

PostPosted: Fri Mar 04, 2011 11:41 pm    Post subject: Reply with quote

Ohh snap,

I forgot to add --metadata to that string ;- )
jamapii
l33t
l33t


Joined: 16 Sep 2004
Posts: 637

PostPosted: Mon Mar 07, 2011 6:23 pm    Post subject: Reply with quote

I use emerge -j4 with portage 2.2

This isn't perfect, but should be better than nothing.

However, sometimes builds fail, and then succeed when building serially.
_______0
Guru
Guru


Joined: 15 Oct 2012
Posts: 521

PostPosted: Tue Oct 16, 2012 8:48 am    Post subject: good title thread wrong useless implementation. Reply with quote

mm..

The title of the thread instantly caught my attention, as I also find some things portage does utter WTFs. The parallelization being talked about here is good but not useful.

I think compiling world in gentoo could be cut by as much as 3/4 just by optimizing different stages of the emerge process. As core counts keep increasing, there are certain stages of the emerge process that look pointless and slow down the overall update. Indeed packages do emerge faster, but the part that I've noticed slows down installing stuff is simple things like "checking".

To illustrate my point let's examine the gcc ebuild. Emerge spends much of the initial time, and other parts, just doing dumb tasks like the following:

Code:
 * Applying Gentoo patches ...                         
 *   01_all_joined-cpp-defs.patch ...                                                                                                   [ ok ]
 *   03_all_java-nomulti.patch ...                                                                                                      [ ok ]
 *   05_all_gcc-4.6.x-siginfo.patch ...                                                                                                 [ ok ]
 *   10_all_default-fortify-source.patch ...                                                                                            [ ok ]
 *   11_all_default-warn-format-security.patch ...                                                                                      [ ok ]
 *   12_all_default-warn-trampolines.patch ...                                                                                          [ ok ]
 *   15_all_libgfortran-Werror.patch ...                                                                                                [ ok ]
 *   15_all_libgomp-Werror.patch ...                                                                                                    [ ok ]
 *   16_all_libgo-Werror-pr53679.patch ...                                                                                              [ ok ]
 *   25_all_alpha-mieee-default.patch ...                                                                                               [ ok ]
 *   26_all_alpha-asm-mcpu.patch ...                                                                                                    [ ok ]
 *   29_all_arm_armv4t-default.patch ...                                                                                                [ ok ]
 *   33_all_armhf.patch ...                                                                                                             [ ok ]
 *   34_all_ia64_note.GNU-stack.patch ...                                                                                               [ ok ]
 *   38_all_sh_pr24836_all-archs.patch ...                                                                                              [ ok ]
 *   42_all_superh_default-multilib.patch ...                                                                                           [ ok ]
 *   50_all_libiberty-asprintf.patch ...                                                                                                [ ok ]
 *   51_all_libiberty-pic.patch ...                                                                                                     [ ok ]
 *   52_all_netbsd-Bsymbolic.patch ...                                                                                                  [ ok ]
 *   64_all_gcc-hppa-64bit-pr52408.patch ...                                                                                            [ ok ]
 *   65_all_gcc-hppa-section-conflicts-pr52999.patch ...                                                                                [ ok ]
 *   74_all_gcc46_cloog-dl.patch ...                                                                                                    [ ok ]
 *   76_all_4.7.0_c-family-headers.patch ...                                                                                            [ ok ]
 *   92_all_freebsd-pie.patch ...                                                                                                       [ ok ]
 * Done with patching                                 
 * Applying uClibc patches ...                         
 *   90_all_100-uclibc-conf.patch ...                                                                                                   [ ok ]
 *   90_all_301-missing-execinfo_h.patch ...                                                                                            [ ok ]
 *   90_all_302-c99-snprintf.patch ...                                                                                                  [ ok ]
 *   90_all_305-libmudflap-susv3-legacy.patch ...                                                                                       [ ok ]
 * Done with patching                                 
 * Applying pie patches ...                           
 *   10_all_gcc45_configure.patch ...                                                                                                   [ ok ]
 *   11_all_gcc45_config.in.patch ...                                                                                                   [ ok ]
 *   12_all_gcc46_Makefile.in.patch ...                                                                                                 [ ok ]
 *   13_all_gcc46_ssp_uclibc_check.patch ...                                                                                            [ ok ]
 *   20_all_gcc46_gcc.c.patch ...                                                                                                       [ ok ]
 *   21_all_gcc44_decl-tls-model.patch ...                                                                                              [ ok ]
 *   22_all_gcc46-default-ssp.patch ...                                                                                                 [ ok ]
 *   30_all_gcc46_esp.h.patch ...                                                                                                       [ ok ]
 *   33_all_gcc46_config_rs6000_linux64.h.patch ...                                                                                     [ ok ]
 *   35_all_gcc46_config_crtbeginp.patch ...                                                                                            [ ok ]
 *   60_all_gcc44_invoke.texi.patch ...                                                                                                 [ ok ]
 * Done with patching


And this:

Code:
checking for kill... yes
checking for getrlimit... yes
checking for setrlimit... yes
checking for atoll... yes
checking for atoq... no
checking for sysconf... yes
checking for strsignal... yes
checking for getrusage... yes
checking for nl_langinfo... yes
checking for gettimeofday... yes
checking for mbstowcs... yes
checking for wcswidth... yes
checking for mmap... yes
checking for setlocale... yes
checking for clearerr_unlocked... yes
checking for feof_unlocked... yes
checking for ferror_unlocked... yes
checking for fflush_unlocked... yes
checking for fgetc_unlocked... yes
checking for fgets_unlocked... yes
checking for fileno_unlocked... yes
checking for fprintf_unlocked... no
checking for fputc_unlocked... yes
checking for fputs_unlocked... yes
checking for fread_unlocked... yes
checking for fwrite_unlocked... yes
checking for getchar_unlocked... yes
checking for getc_unlocked... yes
checking for putchar_unlocked... yes
checking for putc_unlocked... yes
checking whether mbstowcs works... yes
checking for ssize_t... yes
checking for caddr_t... yes
checking for sys/mman.h... (cached) yes
checking for mmap... (cached) yes
checking whether read-only mmap of a plain file works... yes
checking whether mmap from /dev/zero works... yes
checking for MAP_ANON(YMOUS)... yes
checking whether mmap with MAP_ANON(YMOUS) works... yes
checking for pid_t... yes
checking for vfork.h... no
checking for fork... yes
checking for vfork... yes
checking for working fork... yes
checking for working vfork... (cached) yes
checking for ld used by GCC... /usr/lib/gcc/x86_64-pc-linux-gnu/4.6.3/../../../../x86_64-pc-linux-gnu/bin/ld
checking if the linker (/usr/lib/gcc/x86_64-pc-linux-gnu/4.6.3/../../../../x86_64-pc-linux-gnu/bin/ld) is GNU ld... yes
checking for shared library run path origin... done
checking for iconv... yes
checking for iconv declaration... install-shextern size_t iconv (iconv_t cd, char * *inbuf, size_t *inbytesleft, char * *outbuf, size_t *outbytesleft);
checking for LC_MESSAGES... yes
checking for nl_langinfo and CODESET... yes
checking whether getenv is declared... yes
checking whether getenv is declared... yes
checking whether atol is declared... yes
checking whether asprintf is declared... yes
checking whether sbrk is declared... yes
checking whether abort is declared... yes
checking whether atof is declared... yes
checking whether getcwd is declared... yes
checking whether getwd is declared... yes
checking whether strsignal is declared... yes
checking whether strstr is declared... yes
checking whether strverscmp is declared... yes
checking whether errno is declared... yes
checking whether snprintf is declared... yes
checking whether vsnprintf is declared... yes
checking whether vasprintf is declared... yes
checking whether malloc is declared... yes
checking whether realloc is declared... yes
checking whether calloc is declared... yes
checking whether free is declared... yes
checking whether basename is declared... yes
checking whether getopt is declared... no
checking whether clock is declared... yes
checking whether getpagesize is declared... yes
checking whether clearerr_unlocked is declared... yes
checking whether feof_unlocked is declared... yes
checking whether ferror_unlocked is declared... yes
checking whether fflush_unlocked is declared... yes
checking whether fgetc_unlocked is declared... yes
checking whether fgets_unlocked is declared... yes
checking whether fileno_unlocked is declared... yes
checking whether fprintf_unlocked is declared... no
checking whether fputc_unlocked is declared... yes
checking whether fputs_unlocked is declared... yes
checking whether fread_unlocked is declared... yes
checking whether fwrite_unlocked is declared... yes
checking whether getchar_unlocked is declared... yes
checking whether getc_unlocked is declared... yes
checking whether putchar_unlocked is declared... yes
checking whether putc_unlocked is declared... yes
checking whether getrlimit is declared... yes
checking whether setrlimit is declared... yes
checking whether getrusage is declared... yes
checking whether ldgetname is declared... no
checking whether times is declared... yes
checking whether sigaltstack is declared... yes
checking for struct tms... yes
checking for clock_t... yes
checking for .preinit_array/.init_array/.fini_array support... yes


And this:

Code:
Searching /usr/include/.
 Searching /usr/include/./libunrar
 Searching /usr/include/./postgresql
 Searching /usr/include/./libpq
 Searching /usr/include/./boost
 Searching /usr/include/./schily/scg
 Searching /usr/include/./scsilib
 Searching /usr/include/./quicktime


And this:

Code:

Applying io_quotes_use            to linux/uinput.h
Applying io_quotes_use            to linux/usb/tmc.h
Applying io_quotes_use            to linux/input.h
Applying io_quotes_use            to linux/suspend_ioctls.h
Applying io_quotes_use            to linux/ptp_clock.h
Applying io_quotes_use            to linux/reiserfs_fs.h
Applying io_quotes_use            to linux/cm4000_cs.h
Applying io_quotes_use            to linux/watchdog.h
Applying io_quotes_use            to linux/pktcdvd.h
Applying io_quotes_use            to linux/synclink.h
Applying machine_name             to linux/a.out.h
Fixed:  linux/a.out.h
Applying io_quotes_use            to linux/kvm.h
Applying io_quotes_use            to linux/dm-ioctl.h
Applying io_quotes_use            to linux/agpgart.h
Applying io_quotes_use            to linux/dn.h
Applying io_quotes_use            to linux/mmtimer.h
Applying io_quotes_use            to linux/random.h
Applying io_quotes_def            to linux/version.h
Applying io_quotes_use            to linux/nbd.h
Applying io_quotes_use            to linux/atmbr2684.h
Applying io_quotes_use            to linux/ppdev.h
Applying io_quotes_use            to linux/phantom.h
Applying io_quotes_use            to linux/media.h
Applying io_quotes_use            to linux/mmc/ioctl.h
Applying io_quotes_use            to linux/raw.h
Applying io_quotes_use            to linux/spi/spidev.h
Applying io_quotes_use            to linux/blkpg.h
Applying io_quotes_use            to linux/auto_fs.h
Applying io_quotes_use            to linux/i2o-dev.h
Applying io_quotes_use            to linux/fd.h
Applying io_quotes_use            to linux/vhost.h
Applying io_quotes_use            to linux/auto_fs4.h
Applying io_quotes_use            to linux/cciss_ioctl.h
Applying io_quotes_use            to linux/raid/md_u.h
Applying io_quotes_use            to linux/ipmi.h
Applying io_quotes_def            to linux/pci_regs.h
Applying io_quotes_def            to linux/ppp-comp.h
Applying io_quotes_use            to linux/rfkill.h
Applying io_quotes_use            to linux/gigaset_dev.h
Applying io_quotes_use            to linux/fs.h
Applying io_quotes_def            to linux/soundcard.h
Applying io_quotes_use            to linux/omapfb.h
Applying io_quotes_use            to linux/if_pppox.h
Applying io_quotes_use            to linux/hsi/hsi_char.h
Applying io_quotes_use            to linux/vfio.h
Applying pthread_incomplete_struct_argument to pthread.h
Applying hpux8_bogus_inlines      to math.h
Applying ctrl_quotes_def          to readline/chardefs.h
Applying io_quotes_use            to mtd/ubi-user.h
Applying io_quotes_use            to sys/mount.h
Applying io_quotes_use            to sys/raw.h
Applying glibc_stdint             to stdint.h
Applying io_quotes_use            to asm/mtrr.h


Code:
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/as.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/gprof.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/addr2line.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/ar.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/dlltool.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/nlmconv.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/nm.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/objcopy.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/objdump.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/ranlib.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/readelf.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/size.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/strings.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/strip.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/elfedit.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/windres.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/windmc.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/c++filt.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/ld.1.bz2
--- /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/ld.info
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/binutils.info
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/bfd.info
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/gprof.info
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/as.info
>>> Safely unmerging already-installed instance...
No package files given... Grabbing a set.
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/strip
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/strings
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/size
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/readelf
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/ranlib
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/objdump
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/objcopy
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/nm
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/ld.gold
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/ld.bfd
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/ld
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/gprof
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/elfedit
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/c++filt


These stages are present in nearly ALL packages. It often happens, on multiple corez, that compiling is done in the blink of an eye, faster than the "checking" process!! I am sure ppl know what I am talking about; just observe the brutal "checking" of binutils, it's like eternal. Now multiply the time emerge spends on these steps by the 1000 or more packages installed on a system!!!

Now, I ain't no h4x0r, so I can't go into the codez and fix it myself, but the thing doesn't look like it'll be complicated to fix. It doesn't require any code optimization since there's no gcc or C code to compile. It's only a script that does stuff serially; for any of the gentoo coders it should be a piece of cake to fix the parallelization of these ebuild stages.

I thought about this one time when I was recursively extracting with 7z, since 7z doesn't have wildcard decompression/extraction (to my surprise as the rest of compressing tools, bzip, tar, etc). The only solutions out there on the webz were the usual "for i in ...". Seeing how it did it one by one, I searched for more 7z options and added maximum parallelization, but "for i in ..." was still one at a time.

Then I did an experiment to see whether there was a faster way. Don't laugh if this looks rudimentary. What I did was add all 500 7z packages on a single line with && manually. And indeed there was a drastic change in the decompression time!! The solution looked like this:

Code:
7z x foo && 7z x bar && <500 hundred times>


I don't know whether emerge does that checking with "for i in ..." or what, but nowadays, with many cores and massive amounts of ram, "for i in ..." looks outdated and NOT a solution. In general "for i in ...", be it for converting songs from one format to another, for use with ffmpeg, for moving things, etc, is not being kept up to date with hardware advances. There's more parallel optimization at compiler level (harder to figure out) than in a simple script!! Scripts don't suffer from race conditions either, so I think it should be trivial to fix this speed bottleneck in emerge.

Plz, somebody implement my ideas in the gcc ebuild, it's not hard. Something like substituting the part:

Code:
for i in checking ...


With:

Code:
checking whether ... && checking whether ... &&  checking whether ... && checking whether ... && checking whether ... && ...


As I said in the beginning, by my estimates tweaking the ebuilds this way should cut compiling world time by a brutal 3/4.

Can someone provide a fixed gcc ebuild today to see the difference??

thnx in advance.
Scimmia22
n00b
n00b


Joined: 24 Sep 2012
Posts: 10

PostPosted: Tue Oct 16, 2012 9:35 am    Post subject: Reply with quote

________________________0, you're completely missing the fact that the "Checking" isn't a function of the ebuild or portage, but of the configure script included as part of the source code to setup the makefiles.
boerKrelis
Apprentice
Apprentice


Joined: 01 Jul 2003
Posts: 241
Location: The Netherlands

PostPosted: Tue Oct 16, 2012 9:43 am    Post subject: Reply with quote

________________________0, you are unclear on the concepts. Portage != automake/autoconf .

Also, you are talking about two things: 1) portage doing stuff that you think is unnecessary 2) portage not doing various stuff in parallel.

Also, many things you mention either cannot be parallelized (e.g., applying patches should be done in-order) or are I/O-bound.

Also, O("for thing in array_of_things; do $thing") == O(thing1 && thing2 && thing3 && thingN) == O(n) (if thingN == O(1)).
You ran
Code:

7z x foo && 7z x bar && <500 hundred times>   

which runs them *serially*. Your observed speed boost in your second experiment most probably comes from the file & dentry cache; you should drop your caches before you benchmark (I think I mentioned earlier on in this thread how to do that), otherwise you'll be benchmarking the cache.
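To make the serial/parallel distinction concrete, here is a minimal sketch that times a for loop, an && chain and & with wait. The job function is a hypothetical stand-in for one 7z call (it only sleeps for a second), so the numbers reflect scheduling alone and no cache effects are involved.

```shell
#!/bin/sh
# Hypothetical stand-in for one "7z x <archive>" call: takes one second.
job() { sleep 1; }

# for loop: jobs run one after another.
run_loop() { for i in 1 2 3; do job; done; }

# "&&" chain: still one after another; each job waits for the previous one.
run_chain() { job && job && job; }

# "&": every job is started in the background, then we wait for all of them.
run_parallel() { job & job & job & wait; }

# Print the wall-clock time of a command in whole seconds.
elapsed() {
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo $((end - start))
}

t_loop=$(elapsed run_loop)
t_chain=$(elapsed run_chain)
t_parallel=$(elapsed run_parallel)
echo "loop: ${t_loop}s  chain: ${t_chain}s  parallel: ${t_parallel}s"
```

The loop and the chain should take about the same time, and only the & variant should be noticeably shorter. When the jobs do real I/O, flush the page cache between runs (on Linux, as root: sync; echo 3 > /proc/sys/vm/drop_caches) so the later run is not simply served from RAM.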

If you want to parallelize either of these semantically identical constructs you should investigate the upstream gcc build process (autoconf/automake) and parallelize _that_, portage is not involved here. Portage can then take advantage of your upstream parallelization patches through MAKEOPTS.

So if you want to improve your personal portage experience you'll either need to develop an understanding of building software and the portage/build system boundaries (== "become h4x0r") , or run emerge with "--quiet-build=y" as to lessen the confusion caused by all these messages of which you do not yet fully grasp purpose or meaning ;-)

Portage supports parallelization through MAKEOPTS="-jN" and "--jobs".

Good luck on your journey into the not-so-simple world of software!
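
The two knobs work at different levels: MAKEOPTS is handed to make inside a single package build, while --jobs lets emerge build several packages at once. A minimal sketch, assuming one job per CPU core is a reasonable starting point (a common rule of thumb, not something established in this thread) and that nproc from coreutils is available:

```shell
#!/bin/sh
# Number of CPU cores; fall back to 1 if nproc is unavailable.
cores=$(nproc 2>/dev/null || echo 1)

# Line to put in make.conf (the file's location depends on the Portage version).
makeopts_line="MAKEOPTS=\"-j${cores}\""
echo "$makeopts_line"

# Matching emerge invocation: several packages at once, throttled by system load.
emerge_cmd="emerge --jobs=${cores} --load-average=${cores} --update --deep @world"
echo "$emerge_cmd"
```

The script only prints the suggested settings; nothing is changed on the system. Packages whose build systems are not parallel-safe may still need to be rebuilt serially, as noted earlier in the thread.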
pigeon768
l33t
l33t


Joined: 02 Jan 2006
Posts: 683

PostPosted: Thu Oct 18, 2012 10:32 pm    Post subject: Reply with quote

sparc wrote:
On my faulty machine I get the following:
Code:
# time emerge --pretend --update --with-bdeps=y --update @world

These are the packages that would be merged, in order:

Calculating dependencies... done!
...
(package list)
...

real    0m36.832s
user    0m36.301s
sys     0m0.461s
Confused? Portage is a complex tool with too many options. The thing is that these options do not behave the same when combined together. See if I enter --emptytree above I get even shorter times (around 20 something seconds). The time changes if I enter --newuse also. So what is the problem, you might ask.

The problem is the following:
Code:
# time emerge --pretend --update --deep --with-bdeps=y --update @world

These are the packages that would be merged, in order:

Calculating dependencies... done!
...
(a slightly longer package list)
...

real    3m31.638s
user    3m28.788s
sys     0m2.550s
What changed? A simple option called --deep, in combination with everything else. Whatever the case on YOUR system, I'm not going to continue following this thread. That is because in my book the 30 seconds that satisfy you all is still slow performance. And as a last remark, if you all exit your comfort zone and start playing around with portage's options, you too can make it run for 3-5 minutes.

cheers
Code:
~ $ time emerge -uNDpv @world --with-bdeps y

These are the packages that would be merged, in order:

Calculating dependencies... done!

Total: 0 packages, Size of downloads: 0 kB

real   0m14.321s
user   0m14.145s
sys   0m0.142s
-D is --deep.

If it's actually taking 3 1/2 minutes to do an emerge -uNDpv @world, you have other problems. There are some things I'd like to see portage do faster, but I'm not terribly concerned about the lack of multithreading. Multithreading in python sucks eggs anyway.
_______0
Guru
Guru


Joined: 15 Oct 2012
Posts: 521

PostPosted: Tue Dec 18, 2012 6:50 pm    Post subject: Reply with quote

boerKrelis wrote:
________________________0, you are unclear on the concepts. Portage != automake/autoconf .

Also, you are talking about two things: 1) portage doing stuff that you think is unnecessary 2) portage not doing various stuff in parallel.

Also, many things you mention either cannot be parallelized (e.g., applying patches should be done in-order) or are I/O-bound.

Also, O("for thing in array_of_things; do $thing") == O(thing1 && thing2 && thing2 && thingN) == O(n) (if thingN == O(1)).
You ran
Code:

7z x foo && 7z x bar && <500 hundred times>   

which runs them *serially*. Your observed speed boost in your second experiment most probably comes from the file & dentry cache; you should drop your caches before you benchmark (I think I mentioned earlier on in this thread how to do that), otherwise you'll be benchmarking the cache.

If you want to parallelize either of these semantically identical constructs you should investigate the upstream gcc build process (autoconf/automake) and parallelize _that_, portage is not involved here. Portage can then take advantage of your upstream parallelization patches through MAKEOPTS.

So if you want to improve your personal portage experience you'll either need to develop an understanding of building software and the portage/build system boundaries (== "become h4x0r") , or run emerge with "--quiet-build=y" as to lessen the confusion caused by all these messages of which you do not yet fully grasp purpose or meaning ;-)

Portage supports parallelization through MAKEOPTS="-jN" and "--jobs".

Good luck on your journey into the not-so-simple world of software!


'good luck' for which 'purpose and meaning'??? u's crazy. First off, you should have deduced that I MADE A MISTAKE and meant & and NOT &&. If I am speaking about parallelization it's quite obvious that I would not be speaking about &&; moreover, && has linear progress on the terminal.

So do run an experiment with 7z and & then check htop. It's quite amusing.

Also, parallelization of autoconf, automake and binutils (make or whatever the scripts are) is NOT about my personal pleasure ('personal experience'). My idea is to parallelize the "Checking ..." stage so it takes an instant instead of several minutes. Why has nobody parallelized this?? I can't wait.
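
One caveat for the & experiment: starting hundreds of extractions at once oversubscribes both the cores and the disk. A minimal sketch of running the jobs in batches instead, assuming a hypothetical extract function in place of the real 7z call:

```shell
#!/bin/sh
# Hypothetical stand-in for "7z x <archive>"; swap in the real command as needed.
extract() { sleep 1; echo "extracted $1"; }

# Run the given archives with at most $1 jobs in flight at a time.
run_bounded() {
    max=$1; shift
    running=0
    for f in "$@"; do
        extract "$f" &                 # start the job in the background
        running=$((running + 1))
        if [ "$running" -ge "$max" ]; then
            wait                       # let the whole batch finish first
            running=0
        fi
    done
    wait                               # pick up the last, partial batch
}

run_bounded 2 a.7z b.7z c.7z d.7z
```

With a batch size of 2, the four placeholder jobs finish in roughly two seconds instead of four. The order of the output lines within a batch is not guaranteed.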
Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Tue Dec 18, 2012 9:07 pm

________________________0 wrote:
Also, parallelization of autoconf, automake and binutils (make or whatever the scripts are) is NOT about my personal pleasure ('personal experience'). My idea is to parallelize the "Checking ..." stage so that it takes an instant instead of several minutes. Why has nobody parallelized this?? I can't wait.

http://www.gnu.org/software/autoconf/autoconf.html

I'm sure they'd love to hear your idea. Gentoo has nothing to do with the development of the GNU toolchain.
Akkara
Bodhisattva

Joined: 28 Mar 2006
Posts: 6702
Location: &akkara

Posted: Wed Dec 19, 2012 2:15 am

________________________0 wrote:
First off, you should have deduced that I MADE A MISTAKE and meant & and NOT &&. If I am speaking about parallelization, it's quite obvious that I would not be speaking about &&; moreover, && progresses linearly on the terminal.

Wasn't obvious here, either. I read your post, saw the '&&', and I did in fact wonder,
    "is && really that much faster than for i in...".
I was preparing to go try it myself to check, when I read your reply. Both are valid syntax. And when presented with an unexpected claim, made in a thread about improving performance, my first thought was indeed: is there something wrong with the shell's parsing method that needs to be brought to attention?

But, as others have said, all this "checking..." stuff is not portage. It's part of the package's build system. It just so happens that many packages use "autotools" for configuration assistance. It's got lots of history. It used to be that, when coding C, on one platform you'd have to include <strings.h>; on another it might be <string.h>. Even now, less standard stuff could be in different places. And for packages that work on both Linux and MS Windows, lots of stuff needs to be done differently. That's what's behind all those "checking" messages: it's finding out how to do the thing in question on the computer you're currently running.

And, you're right. Most packages check for much of the same things. The place to address this would be in autotools itself: if it could maintain a cache of recently-discovered answers to these questions, it would go a lot faster.

But even that isn't as simple as it sounds: Imagine a music player. Auto-configures to use all the codecs you've got installed. "Checking for ogg..."; "Checking for flac..."; "Checking for mp3..."; and so on. Now you install a new codec and go re-compile the player app. With cached autotools answers, it'll never detect the new one, because it already "knows" the answer is "no".
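
The stale-answer problem can be shown with a toy model. This is not autoconf code; cached_check and CACHE_FILE are names invented for this sketch, and the "check" is just a test for whether a file exists.

```shell
# Toy model of a cached "checking for X..." step. This is NOT autoconf code;
# cached_check and CACHE_FILE are made-up names for illustration.
CACHE_FILE="${CACHE_FILE:-./toy-config.cache}"

cached_check() {
    # $1 = name of the check, $2 = path whose existence is probed
    local name="$1" probe="$2" cached answer
    # Reuse an earlier answer if one was recorded, without probing again.
    if cached=$(grep "^$name=" "$CACHE_FILE" 2>/dev/null); then
        echo "${cached#*=} (cached)"
        return 0
    fi
    if [ -e "$probe" ]; then answer=yes; else answer=no; fi
    echo "$name=$answer" >> "$CACHE_FILE"
    echo "$answer"
}
```

Run the check once while the probed file is missing, then create the file and run it again: the second call still reports "no (cached)". Only removing the cache file makes the new codec visible, and knowing when to do that is the hard part.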

It's not an easy problem to solve without introducing new subtle problems.
_________________
Many think that Dilbert is a comic. Unfortunately it is a documentary.
mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Wed Dec 19, 2012 6:59 am

Akkara wrote:
The place to address this would be in autotools itself: if it could maintain a cache of recently-discovered answers to these questions, it would go a lot faster.

It can and does maintain such a cache for internal usage (e.g. when calling "sub-configures"). You can also maintain this cache manually, e.g. by setting CONFIG_SITE to some file name and setting some result variables there. Gentoo once even had such an automatically maintained cache as a package.
Unfortunately, it turned out to be practically impossible to use such a cache world-wide: not only will you run into trouble, e.g. after upgrades of basic parts of your toolchain, but for practically every test there are some exceptional packages which, for one reason or another, want a different result (e.g. because they intentionally use a modified library path). So unless you maintain a lot of exceptions manually, you cannot use CONFIG_SITE at all; after playing around with it for a while (with good experiences at first), I gave up, since the list of exceptions became too long and the unexpected side effects, which caused hard-to-track bugs, became too annoying.
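
For reference, a manually maintained site file might look roughly like the sketch below. The file location is an assumption, and the two preseeded variables are just examples of the ac_cv_header_* results that header checks store. Every variable set this way is an answer that configure will no longer verify, which is exactly where the exceptions come from.

```shell
# Sketch of a site file, e.g. saved as "$HOME/config.site" (path is an assumption)
# and selected with:  CONFIG_SITE="$HOME/config.site" ./configure
# configure reads it as shell code before running its checks.

# Preseed two header checks unless a value is already set.
: "${ac_cv_header_string_h=yes}"
: "${ac_cv_header_strings_h=yes}"
```

A configure run that picks these up reports the corresponding checks as "(cached)" instead of testing for the headers.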
Yamakuzure
Advocate

Joined: 21 Jun 2006
Posts: 2284
Location: Adendorf, Germany

Posted: Wed Dec 19, 2012 9:41 am

Well, he has one valid point with the "checking" of portage, though. Portage does check all files listed in the manifest and needed by the ebuild against various hashes. AFAIK this is done in a serial way, which takes a lot of time when ebuilds need only portions of a large bunch of large archives. (qt4 comes to my mind.)
_________________
Important German:
  1. "Aha" - German reaction to pretend that you are really interested while giving no f*ck.
  2. "Tja" - German reaction to the apocalypse, nuclear war, an alien invasion or no bread in the house.
mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Wed Dec 19, 2012 3:50 pm

Yamakuzure wrote:
Portage does check all files listed in the manifest and needed by the ebuild against various hashes.

You can restrict the check to a single hash, e.g.
Code:
PORTAGE_CHECKSUM_FILTER='-* sha512'

Anyway, most of the checking time is spent reading in the data, and parallelizing would not speed that up (unless the data is already in cache).
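
To see why reading dominates, consider what a single-hash check boils down to. The helper below is a hypothetical stand-in, not Portage's actual implementation: it compares a file's SHA-512 digest with an expected value, and to do so it has to read the entire file.

```shell
# Hypothetical helper: succeed only if FILE's SHA-512 digest equals EXPECTED.
verify_sha512() {
    local file="$1" expected="$2" actual
    # sha512sum has to read the whole file; on a cold cache this read dominates.
    actual=$(sha512sum "$file" | cut -d' ' -f1) || return 1
    [ "$actual" = "$expected" ]
}
```

Running several of these at once on the same disk mostly makes the readers compete for I/O, so any gain would show up only when the files are already in the page cache.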
_______0
Guru

Joined: 15 Oct 2012
Posts: 521

Posted: Fri Dec 21, 2012 1:39 pm

wow I am retarded,

Code:
emerge --jobs=N


somehow fixes the 'Checking...' problemo. But still not a clean solution.

Still not fully parallelized. It emerges them in blocks, then installs, then emerges again. During the install stage there could be more compiling. And also grep all the virtual/foo packages in one go.
All times are GMT