Author |
Message |
dE_logics Advocate
Joined: 02 Jan 2009 Posts: 2253 Location: $TERM
|
Posted: Thu Mar 03, 2011 1:10 am Post subject: |
|
|
This discussion has grown big.
I gave another thought to this the next day when I was emerging VirtualBox.
After an emerge -pv, most of the data that Portage works on is cached in RAM. So in this case (i.e. a second or third emerge with -pv) it is bound to run faster.
Also, multithreading is possible. You can calculate the dependencies of each package in separate threads; that won't change the outcome. |
|
|
|
Genone Retired Dev
Joined: 14 Mar 2003 Posts: 9532 Location: beyond the rim
|
Posted: Thu Mar 03, 2011 5:31 pm Post subject: |
|
|
Multithreading won't give much of a benefit anyway, as dependency calculation can't be split up as much as you think: decisions in one branch may influence decisions in other branches (e.g. any-of deps). Getting that right concurrently would add enough overhead (and memory usage) to eat up any improvement, and would make debugging and maintenance much harder in the long run (debugging multithreaded apps is always a pain).
Also, I'm not sure if this still applies, but Python used to have some issues with multithreaded apps due to the global interpreter lock; in fact Guido recommended using multiprocessing instead (which wouldn't be a good idea for dep resolution, given the startup times and the rather short-lived processes).
And as has been mentioned, I/O is always an issue with Portage, as it has to access (not necessarily read) many thousands of files for a system-wide dep resolution.
While Portage has had a lot of performance improvements in recent years, a good part is eaten up again by the additional complexity in the dep resolver needed for new requirements like USE/slot deps, improved package selection and EAPI support in general. |
|
|
|
Yamakuzure Advocate
Joined: 21 Jun 2006 Posts: 2284 Location: Adendorf, Germany
|
Posted: Fri Mar 04, 2011 11:45 am Post subject: |
|
|
sparc wrote: | Confused? Portage is a complex tool with too many options. The thing is that these options do not behave the same when combined together. See if I enter --emptytree above I get even shorter times (around 20 something seconds). The time changes if I enter --newuse also. So what is the problem, you might ask.
The problem is the following:
Code: |
# time emerge --pretend --update --deep --with-bdeps=y --update @world
These are the packages that would be merged, in order:
Calculating dependencies... done!
...
(a slightly longer package list)
...
real 3m31.638s
user 3m28.788s
sys 0m2.550s
|
What changed? A simple option called --deep, in combination with everything else. Whatever the case on YOUR system, I'm not going to continue following this thread. That is because in my book the 30 seconds that satisfy you all is still slow performance. And as a last remark, if you all exit your comfort zone and start playing around with portage's options, you too can make it run for 3-5 minutes. |
Don't you ever read? Oh man, this is absolutely ridiculous! I used all these options, just see above! And I did not "make it to 3-5 minutes".
With this attitude you will get yourself onto every ignore list quicker than it took to write this.
And again: forget flat files, try sqlite. "man portage" -> search for "sqlite" ("/ sqlite") -> read it -> install "dev-python/pysqlite" -> create /etc/portage/modules as described in the man page -> run "emerge --metadata" -> be happy.
_________________
Important German:
- "Aha" - German reaction to pretend that you are really interested while giving no f*ck.
- "Tja" - German reaction to the apocalypse, nuclear war, an alien invasion or no bread in the house.
|
|
|
|
Suicidal l33t
Joined: 30 Jul 2003 Posts: 959 Location: /dev/null
|
Posted: Fri Mar 04, 2011 1:05 pm Post subject: |
|
|
I got up to 1m6s with this ridiculous command.
utterly redundant and retarded emerge command: | emerge -uDDDepvN --with-bdeps=y --tree --color=y world |
The problem with most PhD's is that they already know everything. |
|
|
|
Ant P. Watchman
Joined: 18 Apr 2009 Posts: 6920
|
Posted: Fri Mar 04, 2011 10:42 pm Post subject: |
|
|
Suicidal wrote: | I got up to 1m6s with this ridiculous command. |
Ha, I got 0:55.92 on a cold cache and 0:12.57 on a re-run.
(which is already lightning-fast compared to the glacial time it takes for paludis: 3:05.09/1:30.81 for `cave resolve --everything installed-slots`, and that's *with* the ridiculous effort it takes to set up metadata caches in it!) |
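For anyone who wants to reproduce cold vs. warm numbers like these, a rough sketch follows. It assumes Linux and root access; `drop_caches`, `seconds_for` and `cold_warm` are names made up here, while `/proc/sys/vm/drop_caches` is the kernel's own interface.

```shell
#!/bin/sh
# Flush dirty pages, then ask the kernel to drop page cache, dentries and inodes.
# Linux only and needs root; otherwise it returns non-zero without doing anything.
drop_caches() {
    [ -w /proc/sys/vm/drop_caches ] || { echo "cannot drop caches (not root?)" >&2; return 1; }
    sync
    echo 3 > /proc/sys/vm/drop_caches
}

# Print how many whole seconds a command takes (its own output is discarded).
seconds_for() {
    start=$(date +%s)
    "$@" > /dev/null
    echo $(( $(date +%s) - start ))
}

# Run a command once right after dropping caches (if possible) and once more.
cold_warm() {
    if drop_caches; then echo "cold: $(seconds_for "$@")s"; fi
    echo "warm: $(seconds_for "$@")s"
}

# Example (as root): cold_warm emerge -uDNpv @world
```

Without root only the second run is timed, so that figure is "warm" only if the command has already been run recently.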
|
|
|
Suicidal l33t
Joined: 30 Jul 2003 Posts: 959 Location: /dev/null
|
Posted: Fri Mar 04, 2011 11:41 pm Post subject: |
|
|
Ohh snap,
I forgot to add --metadata to that string ;- ) |
|
|
|
jamapii l33t
Joined: 16 Sep 2004 Posts: 637
|
Posted: Mon Mar 07, 2011 6:23 pm Post subject: |
|
|
I use emerge -j4 with portage 2.2
This isn't perfect, but should be better than nothing.
However, sometimes builds fail, and then succeed when building serially. |
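For reference, a sketch of how the same setting might be made permanent in make.conf. The variable and option names (MAKEOPTS, EMERGE_DEFAULT_OPTS, --jobs, --load-average) are Portage's own; the numbers are only example values for a 4-core machine.

```shell
# /etc/portage/make.conf (fragment; values are examples, tune them per machine)

# Parallel make inside each individual package build:
MAKEOPTS="-j4"

# Build up to 4 packages at once, but do not start new jobs
# while the system load average is above 4:
EMERGE_DEFAULT_OPTS="--jobs=4 --load-average=4"
```

If one package only fails when built in parallel, rebuilding just that package with `MAKEOPTS="-j1" emerge --oneshot <package>` is a common workaround.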
|
|
|
_______0 Guru
Joined: 15 Oct 2012 Posts: 521
|
Posted: Tue Oct 16, 2012 8:48 am Post subject: good title thread wrong useless implementation. |
|
|
mm..
The title of the thread instantly caught my attention, as I also find some of the things portage does utter WTFs. The parallelization being talked about here is good but not useful.
I think compiling world in Gentoo could be cut by as much as 3/4 just by optimizing different stages of the emerge process. As core counts keep increasing, there are certain stages of the emerge process that look dumb and slow down the overall update. Indeed packages do emerge faster, but the part I've noticed slowing down installs is simple things like "checking".
To illustrate my point let's examine the gcc ebuild. Emerge spends much of the initial time, and other parts, just doing dumb tasks like the following:
Code: | * Applying Gentoo patches ...
* 01_all_joined-cpp-defs.patch ... [ ok ]
* 03_all_java-nomulti.patch ... [ ok ]
* 05_all_gcc-4.6.x-siginfo.patch ... [ ok ]
* 10_all_default-fortify-source.patch ... [ ok ]
* 11_all_default-warn-format-security.patch ... [ ok ]
* 12_all_default-warn-trampolines.patch ... [ ok ]
* 15_all_libgfortran-Werror.patch ... [ ok ]
* 15_all_libgomp-Werror.patch ... [ ok ]
* 16_all_libgo-Werror-pr53679.patch ... [ ok ]
* 25_all_alpha-mieee-default.patch ... [ ok ]
* 26_all_alpha-asm-mcpu.patch ... [ ok ]
* 29_all_arm_armv4t-default.patch ... [ ok ]
* 33_all_armhf.patch ... [ ok ]
* 34_all_ia64_note.GNU-stack.patch ... [ ok ]
* 38_all_sh_pr24836_all-archs.patch ... [ ok ]
* 42_all_superh_default-multilib.patch ... [ ok ]
* 50_all_libiberty-asprintf.patch ... [ ok ]
* 51_all_libiberty-pic.patch ... [ ok ]
* 52_all_netbsd-Bsymbolic.patch ... [ ok ]
* 64_all_gcc-hppa-64bit-pr52408.patch ... [ ok ]
* 65_all_gcc-hppa-section-conflicts-pr52999.patch ... [ ok ]
* 74_all_gcc46_cloog-dl.patch ... [ ok ]
* 76_all_4.7.0_c-family-headers.patch ... [ ok ]
* 92_all_freebsd-pie.patch ... [ ok ]
* Done with patching
* Applying uClibc patches ...
* 90_all_100-uclibc-conf.patch ... [ ok ]
* 90_all_301-missing-execinfo_h.patch ... [ ok ]
* 90_all_302-c99-snprintf.patch ... [ ok ]
* 90_all_305-libmudflap-susv3-legacy.patch ... [ ok ]
* Done with patching
* Applying pie patches ...
* 10_all_gcc45_configure.patch ... [ ok ]
* 11_all_gcc45_config.in.patch ... [ ok ]
* 12_all_gcc46_Makefile.in.patch ... [ ok ]
* 13_all_gcc46_ssp_uclibc_check.patch ... [ ok ]
* 20_all_gcc46_gcc.c.patch ... [ ok ]
* 21_all_gcc44_decl-tls-model.patch ... [ ok ]
* 22_all_gcc46-default-ssp.patch ... [ ok ]
* 30_all_gcc46_esp.h.patch ... [ ok ]
* 33_all_gcc46_config_rs6000_linux64.h.patch ... [ ok ]
* 35_all_gcc46_config_crtbeginp.patch ... [ ok ]
* 60_all_gcc44_invoke.texi.patch ... [ ok ]
* Done with patching |
And this:
Code: | checking for kill... yes
checking for getrlimit... yes
checking for setrlimit... yes
checking for atoll... yes
checking for atoq... no
checking for sysconf... yes
checking for strsignal... yes
checking for getrusage... yes
checking for nl_langinfo... yes
checking for gettimeofday... yes
checking for mbstowcs... yes
checking for wcswidth... yes
checking for mmap... yes
checking for setlocale... yes
checking for clearerr_unlocked... yes
checking for feof_unlocked... yes
checking for ferror_unlocked... yes
checking for fflush_unlocked... yes
checking for fgetc_unlocked... yes
checking for fgets_unlocked... yes
checking for fileno_unlocked... yes
checking for fprintf_unlocked... no
checking for fputc_unlocked... yes
checking for fputs_unlocked... yes
checking for fread_unlocked... yes
checking for fwrite_unlocked... yes
checking for getchar_unlocked... yes
checking for getc_unlocked... yes
checking for putchar_unlocked... yes
checking for putc_unlocked... yes
checking whether mbstowcs works... yes
checking for ssize_t... yes
checking for caddr_t... yes
checking for sys/mman.h... (cached) yes
checking for mmap... (cached) yes
checking whether read-only mmap of a plain file works... yes
checking whether mmap from /dev/zero works... yes
checking for MAP_ANON(YMOUS)... yes
checking whether mmap with MAP_ANON(YMOUS) works... yes
checking for pid_t... yes
checking for vfork.h... no
checking for fork... yes
checking for vfork... yes
checking for working fork... yes
checking for working vfork... (cached) yes
checking for ld used by GCC... /usr/lib/gcc/x86_64-pc-linux-gnu/4.6.3/../../../../x86_64-pc-linux-gnu/bin/ld
checking if the linker (/usr/lib/gcc/x86_64-pc-linux-gnu/4.6.3/../../../../x86_64-pc-linux-gnu/bin/ld) is GNU ld... yes
checking for shared library run path origin... done
checking for iconv... yes
checking for iconv declaration... install-shextern size_t iconv (iconv_t cd, char * *inbuf, size_t *inbytesleft, char * *outbuf, size_t *outbytesleft);
checking for LC_MESSAGES... yes
checking for nl_langinfo and CODESET... yes
checking whether getenv is declared... yes
checking whether getenv is declared... yes
checking whether atol is declared... yes
checking whether asprintf is declared... yes
checking whether sbrk is declared... yes
checking whether abort is declared... yes
checking whether atof is declared... yes
checking whether getcwd is declared... yes
checking whether getwd is declared... yes
checking whether strsignal is declared... yes
checking whether strstr is declared... yes
checking whether strverscmp is declared... yes
checking whether errno is declared... yes
checking whether snprintf is declared... yes
checking whether vsnprintf is declared... yes
checking whether vasprintf is declared... yes
checking whether malloc is declared... yes
checking whether realloc is declared... yes
checking whether calloc is declared... yes
checking whether free is declared... yes
checking whether basename is declared... yes
checking whether getopt is declared... no
checking whether clock is declared... yes
checking whether getpagesize is declared... yes
checking whether clearerr_unlocked is declared... yes
checking whether feof_unlocked is declared... yes
checking whether ferror_unlocked is declared... yes
checking whether fflush_unlocked is declared... yes
checking whether fgetc_unlocked is declared... yes
checking whether fgets_unlocked is declared... yes
checking whether fileno_unlocked is declared... yes
checking whether fprintf_unlocked is declared... no
checking whether fputc_unlocked is declared... yes
checking whether fputs_unlocked is declared... yes
checking whether fread_unlocked is declared... yes
checking whether fwrite_unlocked is declared... yes
checking whether getchar_unlocked is declared... yes
checking whether getc_unlocked is declared... yes
checking whether putchar_unlocked is declared... yes
checking whether putc_unlocked is declared... yes
checking whether getrlimit is declared... yes
checking whether setrlimit is declared... yes
checking whether getrusage is declared... yes
checking whether ldgetname is declared... no
checking whether times is declared... yes
checking whether sigaltstack is declared... yes
checking for struct tms... yes
checking for clock_t... yes
checking for .preinit_array/.init_array/.fini_array support... yes
|
And this:
Code: | Searching /usr/include/.
Searching /usr/include/./libunrar
Searching /usr/include/./postgresql
Searching /usr/include/./libpq
Searching /usr/include/./boost
Searching /usr/include/./schily/scg
Searching /usr/include/./scsilib
Searching /usr/include/./quicktime
|
And this:
Code: |
Applying io_quotes_use to linux/uinput.h
Applying io_quotes_use to linux/usb/tmc.h
Applying io_quotes_use to linux/input.h
Applying io_quotes_use to linux/suspend_ioctls.h
Applying io_quotes_use to linux/ptp_clock.h
Applying io_quotes_use to linux/reiserfs_fs.h
Applying io_quotes_use to linux/cm4000_cs.h
Applying io_quotes_use to linux/watchdog.h
Applying io_quotes_use to linux/pktcdvd.h
Applying io_quotes_use to linux/synclink.h
Applying machine_name to linux/a.out.h
Fixed: linux/a.out.h
Applying io_quotes_use to linux/kvm.h
Applying io_quotes_use to linux/dm-ioctl.h
Applying io_quotes_use to linux/agpgart.h
Applying io_quotes_use to linux/dn.h
Applying io_quotes_use to linux/mmtimer.h
Applying io_quotes_use to linux/random.h
Applying io_quotes_def to linux/version.h
Applying io_quotes_use to linux/nbd.h
Applying io_quotes_use to linux/atmbr2684.h
Applying io_quotes_use to linux/ppdev.h
Applying io_quotes_use to linux/phantom.h
Applying io_quotes_use to linux/media.h
Applying io_quotes_use to linux/mmc/ioctl.h
Applying io_quotes_use to linux/raw.h
Applying io_quotes_use to linux/spi/spidev.h
Applying io_quotes_use to linux/blkpg.h
Applying io_quotes_use to linux/auto_fs.h
Applying io_quotes_use to linux/i2o-dev.h
Applying io_quotes_use to linux/fd.h
Applying io_quotes_use to linux/vhost.h
Applying io_quotes_use to linux/auto_fs4.h
Applying io_quotes_use to linux/cciss_ioctl.h
Applying io_quotes_use to linux/raid/md_u.h
Applying io_quotes_use to linux/ipmi.h
Applying io_quotes_def to linux/pci_regs.h
Applying io_quotes_def to linux/ppp-comp.h
Applying io_quotes_use to linux/rfkill.h
Applying io_quotes_use to linux/gigaset_dev.h
Applying io_quotes_use to linux/fs.h
Applying io_quotes_def to linux/soundcard.h
Applying io_quotes_use to linux/omapfb.h
Applying io_quotes_use to linux/if_pppox.h
Applying io_quotes_use to linux/hsi/hsi_char.h
Applying io_quotes_use to linux/vfio.h
Applying pthread_incomplete_struct_argument to pthread.h
Applying hpux8_bogus_inlines to math.h
Applying ctrl_quotes_def to readline/chardefs.h
Applying io_quotes_use to mtd/ubi-user.h
Applying io_quotes_use to sys/mount.h
Applying io_quotes_use to sys/raw.h
Applying glibc_stdint to stdint.h
Applying io_quotes_use to asm/mtrr.h
|
Code: | >>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/as.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/gprof.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/addr2line.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/ar.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/dlltool.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/nlmconv.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/nm.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/objcopy.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/objdump.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/ranlib.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/readelf.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/size.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/strings.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/strip.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/elfedit.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/windres.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/windmc.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/c++filt.1.bz2
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/man/man1/ld.1.bz2
--- /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/ld.info
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/binutils.info
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/bfd.info
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/gprof.info
>>> /usr/share/binutils-data/x86_64-pc-linux-gnu/2.22/info/as.info
>>> Safely unmerging already-installed instance...
No package files given... Grabbing a set.
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/strip
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/strings
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/size
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/readelf
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/ranlib
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/objdump
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/objcopy
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/nm
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/ld.gold
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/ld.bfd
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/ld
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/gprof
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/elfedit
--- replaced obj /usr/x86_64-pc-linux-gnu/binutils-bin/2.22/c++filt
|
These stages are present in nearly ALL packages. It often happens, on multiple cores, that compiling is faster, done in the blink of an eye, than the "checking" process!! I am sure people know what I am talking about; just observe the brutal "checking" of binutils, it feels eternal. Now multiply the time emerge spends on these steps by the 1000 or more packages installed on a system!!!
Now, I ain't no h4x0r, so I can't go into the code and fix it myself, but the thing doesn't look like it'll be complicated to fix. It doesn't require any code optimization since there's no gcc or C code to compile. It's only a script that does stuff serially; for any of the Gentoo coders it should be a piece of cake to fix the parallelization of these ebuild stages.
I thought about this one time when I was recursively extracting with 7z, since 7z doesn't have wildcard decompression/extraction (to my surprise, unlike the rest of the compression tools: bzip, tar, etc.). The only solution out there on the web was the usual "for i in ...". Seeing how it did them one by one, I searched for more 7z options and added maximum parallelization, but "for i in ..." was still one at a time.
Then I did some experiments to see whether there was a faster way. Don't laugh if this looks rudimentary. What I did was add all 500 7z archives on a single line with && manually. And indeed there was a drastic change in the decompression time!! The solution looked like this:
Code: | 7z x foo && 7z x bar && <500 hundred times> |
I don't know whether emerge does that checking with "for i in ..." or what, but nowadays, with many cores and massive amounts of RAM, "for i in ..." looks outdated and is NOT a solution. In general "for i in ...", be it for converting songs from one format to another with ffmpeg, moving things, etc., has not kept up with hardware advances. There's more parallel optimization at the compiler level (harder to figure out) than in a simple script!! Scripts don't suffer from race conditions either, so I think it should be trivial to fix this speed bottleneck in emerge.
Please, somebody implement my idea in the gcc ebuild; it's not hard. Something like substituting the part:
Code: | for i in checking ... |
With:
Code: | checking whether ... && checking whether ... && checking whether ... && checking whether ... && checking whether ... && ... |
As I said in the beginning, by my estimates tweaking the ebuilds this way should cut world compile time by a brutal 3/4.
Can someone provide a fixed gcc ebuild today to see the difference??
Thanks in advance. |
|
|
|
Scimmia22 n00b
Joined: 24 Sep 2012 Posts: 10
|
Posted: Tue Oct 16, 2012 9:35 am Post subject: |
|
|
________________________0, you're completely missing the fact that the "checking" isn't a function of the ebuild or portage, but of the configure script that ships with the source code and sets up the makefiles. |
|
|
|
boerKrelis Apprentice
Joined: 01 Jul 2003 Posts: 241 Location: The Netherlands
|
Posted: Tue Oct 16, 2012 9:43 am Post subject: |
|
|
________________________0, you are unclear on the concepts. Portage != automake/autoconf .
Also, you are talking about two things: 1) portage doing stuff that you think is unnecessary 2) portage not doing various stuff in parallel.
Also, many things you mention either cannot be parallelized (e.g., applying patches should be done in-order) or are I/O-bound.
Also, O("for thing in array_of_things; do $thing; done") == O(thing1 && thing2 && thing3 && ... && thingN) == O(n) (if each thing is O(1)).
You ran
Code: |
7z x foo && 7z x bar && <500 hundred times>
|
which runs them *serially*. Your observed speed boost in your second experiment most probably comes from the file & dentry cache; you should drop your caches before you benchmark (I think I mentioned earlier on in this thread how to do that), otherwise you'll be benchmarking the cache.
If you want to parallelize either of these semantically identical constructs, you should investigate the upstream gcc build process (autoconf/automake) and parallelize _that_; portage is not involved here. Portage can then take advantage of your upstream parallelization patches through MAKEOPTS.
So if you want to improve your personal portage experience, you'll either need to develop an understanding of building software and the portage/build system boundaries (== "become h4x0r"), or run emerge with "--quiet-build=y" so as to lessen the confusion caused by all these messages whose purpose or meaning you do not yet fully grasp ;-)
Portage supports parallelization through MAKEOPTS="-jN" and "--jobs".
Good luck on your journey into the not-so-simple world of software! |
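To see the difference for yourself, here is a minimal sketch that uses `sleep 1` as a stand-in for each `7z x` call. The function names are made up for illustration; only `&&`, `&` and `wait` are real shell syntax.

```shell
#!/bin/sh
# Stand-in for one unit of work (e.g. one "7z x archive"); hypothetical helper.
job() { sleep 1; }

# Serial: the loop starts each job only after the previous one has finished.
run_loop() { for i in 1 2 3; do job; done; }

# Also serial: "a && b" starts b only after a has exited successfully.
run_chain() { job && job && job; }

# Parallel: "&" backgrounds each job; "wait" blocks until all of them are done.
run_background() { job & job & job & wait; }

# Print the wall-clock time of a command in whole seconds.
elapsed() {
    start=$(date +%s)
    "$@"
    echo $(( $(date +%s) - start ))
}

echo "for loop: $(elapsed run_loop)s"
echo "&& chain: $(elapsed run_chain)s"
echo "& + wait: $(elapsed run_background)s"
```

On an otherwise idle machine the first two lines should report roughly 3 seconds and the last roughly 1, which is the whole difference between `&&` and `&`.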
|
|
|
pigeon768 l33t
Joined: 02 Jan 2006 Posts: 683
|
Posted: Thu Oct 18, 2012 10:32 pm Post subject: |
|
|
sparc wrote: | On my faulty machine I get the following:
Code: |
# time emerge --pretend --update --with-bdeps=y --update @world
These are the packages that would be merged, in order:
Calculating dependencies... done!
...
(package list)
...
real 0m36.832s
user 0m36.301s
sys 0m0.461s
|
Confused? Portage is a complex tool with too many options. The thing is that these options do not behave the same when combined together. See if I enter --emptytree above I get even shorter times (around 20 something seconds). The time changes if I enter --newuse also. So what is the problem, you might ask.
The problem is the following:
Code: |
# time emerge --pretend --update --deep --with-bdeps=y --update @world
These are the packages that would be merged, in order:
Calculating dependencies... done!
...
(a slightly longer package list)
...
real 3m31.638s
user 3m28.788s
sys 0m2.550s
|
What changed? A simple option called --deep, in combination with everything else. Whatever the case on YOUR system, I'm not going to continue following this thread. That is because in my book the 30 seconds that satisfy you all is still slow performance. And as a last remark, if you all exit your comfort zone and start playing around with portage's options, you too can make it run for 3-5 minutes.
cheers |
Code: |
~ $ time emerge -uNDpv @world --with-bdeps y
These are the packages that would be merged, in order:
Calculating dependencies... done!
Total: 0 packages, Size of downloads: 0 kB
real 0m14.321s
user 0m14.145s
sys 0m0.142s
|
-D is --deep.
If it's actually taking 3 1/2 minutes to do an emerge -uNDpv @world, you have other problems. There are some things I'd like to see portage do faster, but I'm not terribly concerned about the lack of multithreading. Multithreading in python sucks eggs anyway. |
|
|
|
_______0 Guru
Joined: 15 Oct 2012 Posts: 521
|
Posted: Tue Dec 18, 2012 6:50 pm Post subject: |
|
|
boerKrelis wrote: | ________________________0, you are unclear on the concepts. Portage != automake/autoconf .
Also, you are talking about two things: 1) portage doing stuff that you think is unnecessary 2) portage not doing various stuff in parallel.
Also, many things you mention either cannot be parallelized (e.g., applying patches should be done in-order) or are I/O-bound.
Also, O("for thing in array_of_things; do $thing; done") == O(thing1 && thing2 && thing3 && ... && thingN) == O(n) (if each thing is O(1)).
You ran
Code: |
7z x foo && 7z x bar && <500 hundred times>
|
which runs them *serially*. Your observed speed boost in your second experiment most probably comes from the file & dentry cache; you should drop your caches before you benchmark (I think I mentioned earlier on in this thread how to do that), otherwise you'll be benchmarking the cache.
If you want to parallelize either of these semantically identical constructs you should investigate the upstream gcc build process (autoconf/automake) and parallelize _that_, portage is not involved here. Portage can then take advantage of your upstream parallelization patches through MAKEOPTS.
So if you want to improve your personal portage experience you'll either need to develop an understanding of building software and the portage/build system boundaries (== "become h4x0r") , or run emerge with "--quiet-build=y" as to lessen the confusion caused by all these messages of which you do not yet fully grasp purpose or meaning
Portage supports parallelization through MAKEOPTS="-jN" and "--jobs".
Good luck on your journey into the not-so-simple world of software! |
'Good luck' for which 'purpose and meaning'??? You're crazy. First off, you should have deduced that I MADE A MISTAKE and meant & and NOT &&. If I am speaking about parallelization, it's quite obvious that I would not be speaking about &&; moreover, && shows linear progress on the terminal.
So do run an experiment with 7z and &, then check htop. It's quite amusing.
Also, parallelization of autoconf, automake and binutils (make or whatever the scripts are) is NOT about my personal pleasure ('personal experience'). My idea is to parallelize the "Checking ..." stage so it takes an instant instead of several minutes. Why has nobody parallelized this?? I can't wait. |
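If the goal really is `&`-style parallel extraction, a bounded version is probably safer than launching 500 background jobs at once. A minimal sketch, assuming an `xargs` that supports `-0` and `-P` (GNU xargs does); `run_parallel` is a made-up helper name:

```shell
#!/bin/sh
# Run the given command once per NUL-separated name read from stdin,
# with at most $1 commands running at the same time. Hypothetical helper.
run_parallel() {
    max_jobs=$1
    shift
    xargs -0 -n 1 -P "$max_jobs" "$@"
}

# Harmless demonstration: "echo" stands in for the real work.
# The order of the output lines is not guaranteed.
printf '%s\0' one two three | run_parallel 2 echo extracting

# Real use would look like this (not run here):
#   printf '%s\0' *.7z | run_parallel 4 7z x
```

This only helps with independent jobs such as separate archives. It does not apply to the "checking" lines, which are steps inside a single configure script.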
|
|
|
Ant P. Watchman
Joined: 18 Apr 2009 Posts: 6920
|
Posted: Tue Dec 18, 2012 9:07 pm Post subject: |
|
|
________________________0 wrote: | Also, parallelization of autoconf, automake and binutils (make or whatever the scripts are) is NOT about my personal pleasure ('personal experience'). My idea is to parallelize the "Checking ..." stage so it takes an instant instead of several minutes. Why has nobody parallelized this?? I can't wait. |
http://www.gnu.org/software/autoconf/autoconf.html
I'm sure they'd love to hear your idea. Gentoo has nothing to do with the development of the GNU toolchain. |
|
|
|
Akkara Bodhisattva
Joined: 28 Mar 2006 Posts: 6702 Location: &akkara
|
Posted: Wed Dec 19, 2012 2:15 am Post subject: |
|
|
________________________0 wrote: | First off, you should have deduced that I MADE A MISTAKE and meant & and NOT &&. If I am speaking about parallelization, it's quite obvious that I would not be speaking about &&; moreover, && shows linear progress on the terminal. |
Wasn't obvious here, either. I read your post, saw the '&&', and I did in fact wonder, "is && really that much faster than for i in...?". I was preparing to go try it myself to check when I read your reply. Both are valid syntax. And when presented with an unexpected claim, made in a thread about improving performance, my first thought was indeed: is there something wrong with the shell's parsing method that needs to be brought to attention?
But, as others have said, all this "checking..." stuff is not portage. It's part of the package's build system. It just so happens that many packages use "autotools" for configuration assistance. It's got lots of history. It used to be that, when coding C, on one platform you'd have to include <strings.h>; on another it might be <string.h>. Even now, less standard stuff could be in different places. And for packages that work on both Linux and MS Windows, lots of stuff needs to be done differently. That's what's behind all those "checking" messages: it's finding out how to do the thing in question on the computer you're currently running.
And, you're right. Most packages check for much of the same things. The place to address this would be in autotools itself: if it can maintain a cache of recently-discovered answers to these questions, it would go a lot faster.
But even that isn't as simple as it sounds. Imagine a music player that auto-configures to use all the codecs you've got installed: "Checking for ogg..."; "Checking for flac..."; "Checking for mp3..."; and so on. Now you install a new codec and go re-compile the player app. With cached autotools answers, it'll never detect the new one, because it already "knows" the answer is "no".
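The stale-answer problem can be sketched with a toy cache. This is not how autoconf stores its results; `probe`, `check` and the cache file format are invented for the illustration.

```shell
#!/bin/sh
# Toy model of a cached configure check; not autoconf's real implementation.
cache_file=$(mktemp)   # stands in for a shared answer cache
installed=""           # pretend list of installed codecs

# The "expensive" probe: is the codec installed right now?
probe() {
    case " $installed " in
        *" $1 "*) echo yes ;;
        *) echo no ;;
    esac
}

# Cached check: reuse a stored answer if there is one, else probe and store it.
check() {
    answer=$(grep "^$1=" "$cache_file" | cut -d= -f2)
    if [ -z "$answer" ]; then
        answer=$(probe "$1")
        echo "$1=$answer" >> "$cache_file"
    fi
    echo "checking for $1... $answer"
}

check flac          # first run: probes the system and caches the answer
installed="flac"    # the codec gets installed afterwards
check flac          # second run: answered from the now stale cache
rm -f "$cache_file"
```

Both calls print `checking for flac... no`: the second one never looks at the system again, so the newly installed codec goes unnoticed until the cache is cleared.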
It's not an easy problem to solve without introducing new subtle problems. _________________ Many think that Dilbert is a comic. Unfortunately it is a documentary. |
|
|
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Wed Dec 19, 2012 6:59 am Post subject: |
|
|
Akkara wrote: | The place to address this would be in autotools itself, if it can maintain a cache of recently-discovered answers to these questions, it would go a lot faster |
It can and does maintain such a cache for internal use (e.g. when calling "sub-configures"). You can also maintain this cache manually, e.g. by setting CONFIG_SITE to some file name and setting some result variables there. Gentoo once even had such an automatically maintained cache as a package.
Unfortunately, it turned out to be practically impossible to use such a cache globally: not only will you get into trouble e.g. after upgrades of basic parts of your toolchain, but for practically every test there are some exceptional packages which for one reason or another want a different result (e.g. because they intentionally use a modified library path, or for similar reasons). So unless you maintain a lot of exceptions manually, you cannot use CONFIG_SITE at all; after playing around with it for a while (with good experiences at first), I gave it up because the list of exceptions became too long and the unexpected side effects, which caused hard-to-track bugs, too annoying. |
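For reference, a sketch of what such a site file can look like. `CONFIG_SITE` and the `ac_cv_func_*` / `ac_cv_header_*` naming come from autoconf; the temporary file and the three answers are just example values, and as described above they are only safe as long as every package agrees with them.

```shell
#!/bin/sh
# Write an example site file. A configure script sources the file named by
# CONFIG_SITE before running its checks, so these become pre-seeded answers.
site_file=$(mktemp)
cat > "$site_file" <<'EOF'
# Example answers, valid only for the machine they were recorded on.
ac_cv_func_mmap=yes
ac_cv_func_getrusage=yes
ac_cv_header_vfork_h=no
EOF

# Use with a real package (not run here):
#   CONFIG_SITE="$site_file" ./configure
# The pre-seeded checks should then be reported as "(cached)".

# The file is plain shell, so sourcing it shows what configure would pick up.
. "$site_file"
echo "mmap: $ac_cv_func_mmap, vfork.h: $ac_cv_header_vfork_h"
```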
|
|
|
Yamakuzure Advocate
Joined: 21 Jun 2006 Posts: 2284 Location: Adendorf, Germany
|
Posted: Wed Dec 19, 2012 9:41 am Post subject: |
|
|
Well, he has one valid point with the "checking" of portage, though. Portage does check all files listed in the manifest and needed by the ebuild against various hashes. AFAIK this is done serially, which takes a lot of time when ebuilds need only portions of a large bunch of large archives. (qt4 comes to mind.)
|
|
|
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Wed Dec 19, 2012 3:50 pm Post subject: |
|
|
Yamakuzure wrote: | Portage does check all files listed in the manifest and needed by the ebuild against various hashes. |
You can select that only one hash is used, e.g.
Code: | PORTAGE_CHECKSUM_FILTER='-* sha512' |
Anyway, most of the checking time is spent reading in the data, and that would not get any faster through parallelization (if the data is not already in cache). |
|
|
|
_______0 Guru
Joined: 15 Oct 2012 Posts: 521
|
Posted: Fri Dec 21, 2012 1:39 pm Post subject: |
|
|
wow I am retarded,
somehow fixes the 'Checking...' problemo. But still not a clean solution.
Still not fully parallelized. It emerges them in blocks then installs then emerges again. During the install stage there could be more compiling. And also grep all the virtual/foo packages in one go. |
|
|
|
|
|
|
|