CoolAJ86 n00b


Joined: 24 May 2004 Posts: 61 Location: Shelburne, VT
|
Posted: Mon Apr 04, 2005 11:40 pm Post subject: 2005.0 Desktop PC - 'being fast' vs 'feeling fast' |
|
|
Although I could totally rice my system again, as I prepare to reinstall my system using kernel 2.6, udev, nptl, and whatever other big changes have occurred, I'd rather get it right all at once this time. Sure, I could emerge -e, but there are sure to be problems that I have to look into, and it'd be a waste of time when there are so many packages I never use, just installed to play with.
I'm not running a server. I don't care if things actually are fast - I just want it to feel fast! I don't care if my browser uses 2.3% less CPU utilization or not - I'd much rather it load faster because I've got CPU to burn and memory is a commodity. On the other hand, I don't want to completely leave my system unoptimized either. I want to find a nice balance that leans a bit more towards 'feeling' fast where applicable.
I did some homework and after reading
this: http://gentoo-wiki.com/CFLAGS
this: https://forums.gentoo.org/viewtopic-t-309752-highlight-ricers.html
this: http://gcc.gnu.org/onlinedocs/gcc-3.4.0/gcc/Optimize-Options.html
this: https://forums.gentoo.org/viewtopic-t-67777-highlight-ldflag.html
this: http://gentoo-wiki.com/HOWTO_Install_Gentoo_-_The_Gentoo_Developers_Method_with_NPTL_and_2.6_from_Stage1#Updated_make.conf
and a few other assorted docs, I've come up with this
| Code: | /etc/make.conf: (revised with suggestions)
CHOST="i686-pc-linux-gnu"
# Small binaries for fast load times, but reasonably optimized
CFLAGS="-Os -march=i686 -mtune=athlon-xp -pipe -falign-functions=4 -fweb -D_FILE_OFFSET_BITS=64"
CXXFLAGS="${CFLAGS}"
# A little bit 'a rice
LDFLAGS="-Wl,-O1 -Wl,--sort-common -z combreloc -Wl,--enable-new-dtags" # -Wl,--relax"
MAKEOPTS="-j3"
AUTOCLEAN="yes"
PORTAGE_NICENESS="15"
FEATURES="candy ccache buildpkg digest fixpackages" # sfperms sandbox userpriv usersandbox
CCACHE_SIZE="2G"
## USE ##
# HARDWARE
CPU="3dnow gcj gcc mmx mmx2 nptl nptlonly pic sse threads x86"
PM="acpi -apm"
VIDEO="nvidia mythtv xinerama -3dfx"
AUDIO="audiofile alsa -arts -esd openal"
BLK_DEV="cdr cdparanoia dvd dvdr"
NET="howl -ipv6 -nas samba tcpd"
OTHER="cups foomaticdb gphoto2 gimp-print gpm hal -pcmcia pda ppds scanner -joystick usb xprint"
##
HARDWARE="${CPU} ${PM} ${VIDEO} ${AUDIO} ${BLK_DEV} ${NET} ${OTHER}"
# SOFTWARE
SYS_AUTH="berkdb crypt hardened -imap innodb ldap mysql pam pam-mysql -selinux ssl"
X11="X accessibility artworkextra bonobo gnome gtk gtk2 gtkhtml -Xaw3d -qt -kde"
MMX="avi divx4linux dv encode exif flac flash -gd gif gimp gstreamer imagemagick jpeg mad mime ming mpeg ogg oggvorbis pdflib png quicktime tiff wmf xine xmms xvid" #fdftk
WWW="aim apache2 icq ftp jabber libwww maildir mozilla msn nntp oscar yahoo xml xml2"
PROG="bash-completion curl java mono pcre perl php python" # cscope ruby"
MISC="gnomedb gnutls spell tcltk tidy truetype"
##
SOFTWARE="${SYS_AUTH} ${X11} ${MMX} ${PROG} ${WWW} ${MISC}"
USE="${HARDWARE} ${SOFTWARE}"
# ALSA_CARDS="via82cxx"
ALSA_CARDS="emu10k1"
DISTDIR=/usr/portage/distfiles
PKGDIR=/usr/portage/packages
PORTDIR_OVERLAY=/usr/local/portage |
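When USE is assembled from grouped variables like this, it is easy to end up with a flag that is both enabled in one group and disabled in another. Here is a minimal sketch of a check for that. `use_conflicts` is a hypothetical helper written for illustration, not something Portage ships.

```shell
#!/bin/sh
# Hypothetical helper (not part of Portage): print every flag that
# appears both enabled ("kde") and disabled ("-kde") in the
# whitespace-separated USE string passed as $1.
use_conflicts() {
    for flag in $1; do
        case $flag in
            -*) ;;  # negated entries are only used as lookup targets
            *)
                for other in $1; do
                    if [ "$other" = "-$flag" ]; then
                        printf '%s\n' "$flag"
                    fi
                done
                ;;
        esac
    done | sort -u
}

# Example: use_conflicts "$USE"
```

Run against the USE value built above, it should print nothing, since none of the negated flags there is also enabled.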
any other ideas to boost speed while maintaining a small binary size? _________________ "May the source be with you."
Laterz-
~CoolAJ86
http://CoolAJ86.Havenite.net
http://www.uvm.org/vague - LUG VT
Last edited by CoolAJ86 on Sun May 01, 2005 8:56 pm; edited 4 times in total |
|
|
 |
beavsux n00b


Joined: 02 Apr 2005 Posts: 37
|
Posted: Tue Apr 05, 2005 1:42 am Post subject: Re: 2005.0 Desktop PC - 'being fast' vs 'feeling fast' |
|
|
| CoolAJ86 wrote: |
/etc/make.conf:
CFLAGS="-Os -march=athlon-xp -pipe -msse -mfpmath=sse -falign-functions=64 -fforce-addr -D_FILE_OFFSET_BITS=64"
CXXFLAGS="${CFLAGS}"
LDFLAGS="-Wl,-O1 -Wl,--relax -Wl,--enable-new-dtags -Wl,--sort-common -s -z combreloc"
any other ideas to boost speed while maintaining a small binary size? |
-Os will reduce binary size at the expense of speed. The wiki says "This will probably decrease load times for large applications such as Mozilla", but there are some optimizations that will increase speed, like loop unrolling and inlining, that won't be used with -Os.
--sort-common will take your code out of alignment. This decreases binary size at the expense of speed.
-s is done already at the end of an emerge, when the binaries are stripped (man strip)
-falign-functions should already be set to something reasonable by the compiler. If you change -Os to -O2, you'll get it in there anyway. Same goes for -D_FILE_OFFSET_BITS.
-fforce-addr is straight up bad. GCC has internal heuristics to know when a memory value has to be copied to a register, and forcing them to be copied to registers can greatly degrade performance, by taking registers away from better uses.
If I were you, I'd go with the following:
| Code: |
CFLAGS="-O2 -march=athlon-xp -pipe -funroll-loops"
CXXFLAGS="${CFLAGS}"
LDFLAGS="-Wl,-O1 -Wl,--relax -z combreloc"
|
LDFLAGS could be left blank too. Post-linker optimizations aren't that great in most cases. |
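To try the advice above without retyping the whole line, the flags argued against (such as -fforce-addr or -s) can be filtered out of an existing value. This is only a sketch. `drop_flags` is a made-up name. Inside ebuilds, Portage's flag-o-matic eclass offers `filter-flags` for a similar purpose.

```shell
#!/bin/sh
# Hypothetical helper: print the flag string in $1 with every word
# that matches one of the remaining arguments removed.
drop_flags() {
    flags=$1
    shift
    out=
    for f in $flags; do
        keep=1
        for bad in "$@"; do
            if [ "$f" = "$bad" ]; then
                keep=0
            fi
        done
        if [ "$keep" -eq 1 ]; then
            out="${out:+$out }$f"
        fi
    done
    printf '%s\n' "$out"
}

# Example: CFLAGS=$(drop_flags "$CFLAGS" -fforce-addr -mfpmath=sse)
```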
|
|
 |
thechris Veteran

Joined: 12 Oct 2003 Posts: 1203
|
Posted: Tue Apr 05, 2005 8:31 am Post subject: |
|
|
Search for the long acovea thread. i have some findings in there.
should use:
-O2 - use this as a base -Os is also fine
-fweb -- should be good and safe
-march=athlon-xp -- makes sense
-pipe -- standard
do not need:
-msse -- implied by -march=athlon-xp
-falign-functions=64 -- in my testing, i've seen the values of 4 or 16 come out better. try 4
should not use:
-mfpmath=sse -- explicitly setting this rarely helps and very often comes out significantly worse.
(-fnew-ra) -- this didn't work right...
look into a different desktop environment. i've had good luck with KDE, likely because the apps don't need to load new libs all the time...
also look into "prelink"
possibly try different IO and CPU schedulers
some people used to renice X. not sure if that is a good idea though. _________________ HW problems. It's a VIA thing. |
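One way to check the claim that -msse is implied by -march=athlon-xp is to look at the macros the compiler predefines for that target. The sketch below assumes an x86 gcc is installed for the real check. `defines_macro` is a hypothetical helper that just scans a macro dump on stdin.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the macro named in $1 is #defined
# in the predefined-macro dump read from stdin.
defines_macro() {
    grep "^#define $1[[:space:]]" >/dev/null
}

# Usage on a machine with an x86 gcc (not run here):
#   gcc -march=athlon-xp -dM -E -x c /dev/null | defines_macro __SSE__ \
#       && echo "SSE is already enabled by -march=athlon-xp"
```

If `__SSE__` shows up without -msse on the command line, adding -msse changes nothing.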
|
|
 |
Bob P Advocate


Joined: 20 Oct 2004 Posts: 3355 Location: Jackass! Development Labs
|
Posted: Tue Apr 05, 2005 8:49 am Post subject: Re: 2005.0 Desktop PC - 'being fast' vs 'feeling fast' |
|
|
| CoolAJ86 wrote: | | Although I could totally rice my system again, as I prepare to reinstall my system using kernel 2.6, udev, nptl, and whatever other big changes have occurred, I'd rather get it right all at once this time. Sure, I could emerge -e, but there are sure to be problems that I have to look into, and it'd be a waste of time when there are so many packages I never use, just installed to play with. |
use the GCC 3.4.3 toolkit and the officially recommended CFLAGS for your hardware and you'll have the best combination of performance and reliability. if you're going to be using a 2.6 kernel, forget about ricing with CFLAGS. your biggest performance gain will come from activating USE="nptl". _________________ .
Stage 1/3 | Jackass! | Rockhopper! | Thanks | Google Sucks |
|
|
 |
CoolAJ86 n00b


Joined: 24 May 2004 Posts: 61 Location: Shelburne, VT
|
Posted: Tue Apr 05, 2005 4:25 pm Post subject: stage1 not up-to-date enough? |
|
|
Thanks for the help, I've updated what I think some of my vars should be based on your comments.
I've started a stage1 install with the flags above, but I had to remove -fweb otherwise gcc complains to me that the gcc compiler could not create executables. I think this is because I'm using gcc 3.3.5, not the latest version.
If that's the case, then it kinda makes a stage1 install seem pretty pointless...
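The "cannot create executables" error usually just means the compiler rejected something in CFLAGS, which would fit a gcc 3.3.5 that does not know -fweb. A flag can be probed before it goes into make.conf. This is a rough sketch. `cc_accepts_flag` is a made-up helper, and the compiler name is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper: succeed if compiler $1 can build a trivial
# program with flag $2. Works in a throwaway directory so nothing is
# left behind.
cc_accepts_flag() {
    cc=$1
    flag=$2
    tmpdir=$(mktemp -d) || return 2
    printf 'int main(void) { return 0; }\n' > "$tmpdir/probe.c"
    if "$cc" "$flag" "$tmpdir/probe.c" -o "$tmpdir/probe" >/dev/null 2>&1; then
        status=0
    else
        status=1
    fi
    rm -rf "$tmpdir"
    return "$status"
}

# Example: cc_accepts_flag gcc -fweb || echo "drop -fweb for this gcc"
```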
Would it be better to start with a stage3, emerge the latest gcc (not sure how I should set up my make.conf for that...), and then `emerge -e`?
thoughts?
oh, and... what are the IO / CPU schedulers? _________________ "May the source be with you."
Laterz-
~CoolAJ86
http://CoolAJ86.Havenite.net
http://www.uvm.org/vague - LUG VT |
|
|
 |
beavsux n00b


Joined: 02 Apr 2005 Posts: 37
|
Posted: Tue Apr 05, 2005 5:27 pm Post subject: Re: stage1 not up-to-date enough? |
|
|
| CoolAJ86 wrote: |
Would it be better to start with a stage3, emerge the latest gcc (not sure how I should set up my make.conf for that...), and then `emerge -e`?
thoughts?
oh, and... what are the IO / CPU schedulers? |
From what I understand, doing a stage 3 and then emerge -e system (to update the toolchain) and then emerge -e world (to build the whole system with the new toolchain) is nearly the same as a stage 1 install, and you get a working system much more quickly.
IO/CPU schedulers can be found as options in the kernel source. |
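Once I/O schedulers are built into the kernel, recent 2.6 kernels also list them per disk in sysfs, with the active one in square brackets. Here is a small sketch for reading that. `active_scheduler` is a hypothetical helper, and the device name `hda` in the usage comment is only an assumption about the disk.

```shell
#!/bin/sh
# Hypothetical helper: read a scheduler line such as
#   noop anticipatory deadline [cfq]
# from stdin and print the bracketed (active) entry.
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Usage (device name is an assumption, not run here):
#   active_scheduler < /sys/block/hda/queue/scheduler
```

Writing one of the listed names back into the same file as root switches schedulers on kernels that support runtime selection, and the `elevator=` boot parameter sets the default.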
|
|
 |
Bob P Advocate


Joined: 20 Oct 2004 Posts: 3355 Location: Jackass! Development Labs
|
Posted: Tue Apr 05, 2005 5:46 pm Post subject: Re: stage1 not up-to-date enough? |
|
|
| beavsux wrote: | | From what I understand, doing a stage 3 and then emerge -e system (to update the toolchain) and then emerge -e world (to build the whole system with the new toolchain) is nearly the same as a stage 1 install, and you get a working system much more quickly. |
well, yes and no. it's always quicker to do a stage 1 than it is to do a stage 3 and rebuild. the reason for this is that you can't just make one pass with the --emptytree option and have everything work properly. you actually have to emerge -e system twice because portage does not build the toolkit components in the right order. ditto for the world files.
your suggested method will actually work if you do it from a fresh stage 3 install. although you hadn't realized it, on a stage 3 tarball the system files and the world files are the same. so when you emerge -e system and emerge -e world, you're actually performing the redundant "emerge -e system && emerge -e system" that i had mentioned previously. this would not be the case once you install any programs. then you'd have to "emerge -e system && emerge -e system && emerge -e world && emerge -e world." that takes A LOT longer than bootstrapping. _________________ .
Stage 1/3 | Jackass! | Rockhopper! | Thanks | Google Sucks |
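The two cases described in this post can be written out as a plan that only prints the commands. This is a sketch of the reasoning, not a tested install procedure. `rebuild_plan` and its `fresh` argument are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the rebuild passes described
# in the post above.
#   $1 = "fresh"       -> untouched stage 3, where system and world match
#   $1 = anything else -> extra packages are already installed
rebuild_plan() {
    if [ "$1" = "fresh" ]; then
        # world == system here, so these two passes rebuild the same
        # package set twice.
        echo "emerge -e system"
        echo "emerge -e world"
    else
        echo "emerge -e system"
        echo "emerge -e system"
        echo "emerge -e world"
        echo "emerge -e world"
    fi
}

# Example: rebuild_plan fresh
```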
|
|
 |
beavsux n00b


Joined: 02 Apr 2005 Posts: 37
|
Posted: Wed Apr 06, 2005 12:49 am Post subject: Re: stage1 not up-to-date enough? |
|
|
| Bob P wrote: | | beavsux wrote: | | From what I understand, doing a stage 3 and then emerge -e system (to update the toolchain) and then emerge -e world (to build the whole system with the new toolchain) is nearly the same as a stage 1 install, and you get a working system much more quickly. |
well, yes and no. it's always quicker to do a stage 1 than it is to do a stage 3 and rebuild... then you'd have to "emerge -e system && emerge -e system && emerge -e world && emerge -e world." that takes A LOT longer than bootstrapping. |
I was unclear in my previous statement. To me, any compilation that occurs after you have a working system is free in terms of time. My method avoids having to wait around for a working system (which is especially good when helping others install Gentoo). When PORTAGE_NICENESS is set to something high, the compile isn't even noticeable on a sufficiently fast system. You are correct, though, that the stage3 re-emerge method takes far more actual CPU time and isn't as clean a procedure as a stage1 emerge.
The method I use is for people with binary packages installed (X.org, gaim, etc), not plain stage 3's (even though I said that before. My mistake.). For a plain stage 3, Bob P is correct. At this point I'm doubting my whole installation procedure, however.
This is what I do. My goal is to get the system to a usable state quickly, then emerge in the background:
1. install a stage 3, and use emerge -K to quickly get many applications onto the system from the 2005.0 packages CD.
2. rebuild the kernel
3. set up make.conf and do basic configuration
4. get all the hardware working properly
5. emerge sync
6. emerge -uDv --newuse world
7. fix any remaining problems with the system
8. emerge -e world
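The emerge part of that list can be wrapped in a small script with a dry-run switch, so the sequence can be reviewed before it is left running in the background. This is only a sketch. `run`, `post_install_rebuild` and the `DRY_RUN` variable are made-up names, and the emerge invocations are copied from the steps above, including the 2005-era `emerge sync` spelling.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# The emerge steps from the list above, in order. The manual steps
# (kernel, make.conf, hardware, fixing problems) are left out.
post_install_rebuild() {
    run emerge sync
    run emerge -uDv --newuse world
    run emerge -e world
}

# Example: DRY_RUN=1 post_install_rebuild
```

With a high PORTAGE_NICENESS in make.conf, as mentioned above, the real run should stay out of the way of desktop use.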
Last edited by beavsux on Wed Apr 06, 2005 4:10 am; edited 1 time in total |
|
|
 |
CoolAJ86 n00b


Joined: 24 May 2004 Posts: 61 Location: Shelburne, VT
|
Posted: Wed Apr 06, 2005 1:12 am Post subject: for future reference |
|
|
This has been really helpful. Thank you so much.
I just want to clarify about the importance of the bootstrap operation. So say I have a working system all built up - a stage4, if you will. If I were to dramatically change my use flags and rebuild the system with gcc-4.0 in 6 months and I wasn't reinstalling from a stage1, the cleanest and best way to go about that would be to run `/usr/portage/scripts/bootstrap.sh` and then `emerge -e world`? _________________ "May the source be with you."
Laterz-
~CoolAJ86
http://CoolAJ86.Havenite.net
http://www.uvm.org/vague - LUG VT |
|
|
 |
beavsux n00b


Joined: 02 Apr 2005 Posts: 37
|
Posted: Wed Apr 06, 2005 4:01 am Post subject: |
|
|
| I'm not sure if the bootstrap script is needed (or what it does exactly on an install that is already completed, for that matter). You might just have to install the new toolchain through emerge, then emerge -e world |
|
|
 |
Gentree Watchman


Joined: 01 Jul 2003 Posts: 5350 Location: France, Old Europe
|
Posted: Thu Apr 07, 2005 5:51 pm Post subject: |
|
|
There is a "developer's way" to do a stage1 on a running system that is supposed to be better than doing a clean stage1. I was looking for it when I ended up here.
I think the original poster was raq
HTH  _________________ Linux, because I'd rather own a free OS than steal one that's not worth paying for.
Gentoo because I'm a masochist
AthlonXP-M on A7N8X. Portage ~x86 |
|
|
 |
Monstros Tux's lil' helper


Joined: 07 Jul 2004 Posts: 111
|
Posted: Thu Apr 07, 2005 7:33 pm Post subject: |
|
|
you should read https://forums.gentoo.org/viewtopic-t-319349.html : Bob P explains how to install a Stage 1 from a stage 3 with the latest gcc _________________ Monstros Velu - Nioub
- Core 2 Duo E6600, eVGA n680i, 2Go DDR2 PC2-8500, 8800GTS 640Mo, 2x320Go SATA HD
- Fujitsu-Siemens M3438G 75005, Pentium M 750, 1Go DDR2, 2x80Go HD, 6800GO 256Mo, 17" 1440x900 |
|
|
 |
 |
Gentree Watchman


Joined: 01 Jul 2003 Posts: 5350 Location: France, Old Europe
|
Posted: Fri Apr 08, 2005 1:19 am Post subject: |
|
|
| thechris wrote: | | Search for the long acovea thread. i have some findings in there. | acovea is misleading since it uses very old test code. It is an example of evolutionary algorithms, not a serious tuning tool, as its author says.
| thechris wrote: | | possibly try different IO and CPU schedulers | I find nitro-sources give a very responsive desktop system. They are based on the mighty ck-sources.
| thechris wrote: | | some people used to renice X. not sure if that is a good idea though. | NO, that is no longer valid; it's now a negative.
| thechris wrote: | | -falign-functions=64 -- in my testings, i've seen the values of 4 or 16 come out better. try 4 |
That seems a little surprising on an athlon-xp. Are you basing that on acovea results, or have you tested it by more realistic means?
That's about the only non-std flag I use, so I'd be interested if you can show it has a neg. impact.
Thx. _________________ Linux, because I'd rather own a free OS than steal one that's not worth paying for.
Gentoo because I'm a masochist
AthlonXP-M on A7N8X. Portage ~x86 |
|
|
 |
|
|
|
|