nxsty Veteran
Joined: 23 Jun 2004 Posts: 1556 Location: .se
|
Posted: Fri Oct 21, 2005 9:28 am Post subject: |
|
|
revertex wrote: | @Joffer:
Doesn't "-march=athlon64" include "-mfpmath=sse"?
And why do people still add "-fomit-frame-pointer" to their CFLAGS when they use -O, -O2, -O3 or -Os?
I found this on the GCC online docs page:
Quote: | -fomit-frame-pointer
Don't keep the frame pointer in a register for functions that don't need one. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions. It also makes debugging impossible on some machines.
On some machines, such as the VAX, this flag has no effect, because the standard calling sequence automatically handles the frame pointer and nothing is saved by pretending it doesn't exist. The machine-description macro FRAME_POINTER_REQUIRED controls whether a target machine supports this flag. See Register Usage.
Enabled at levels -O, -O2, -O3, -Os. |
it seems "-fomit-frame-pointer" is absolutely redundant if you use "-O?" |
Yes, -fomit-frame-pointer is included in -O on amd64, so it's useless to have it in CFLAGS. But there are other arches like x86 where it's not, so that's probably why people have it. -march=athlon64 and friends don't include -mfpmath=sse, but it's on by default in amd64 compilers. |
|
|
|
Gnufsh Guru
Joined: 28 Dec 2002 Posts: 400 Location: Portland, OR
|
Posted: Fri Oct 21, 2005 1:13 pm Post subject: |
|
|
AFAIK -fomit-frame-pointer is implied by -O, -O2, -O3 and -Os when it does not break debugging. It breaks debugging on x86, so you have to include it manually there. I think that habit has just spilled over to amd64, where it doesn't break debugging and you don't need to specify it separately. |
|
|
|
sirdilznik l33t
Joined: 28 Apr 2005 Posts: 731
|
Posted: Fri Oct 21, 2005 6:33 pm Post subject: |
|
|
I use CFLAGS="-march=k8 -O3 -pipe"
I figure keep it simple. Zero stability problems and my system is pretty fast |
|
|
|
Gnufsh Guru
Joined: 28 Dec 2002 Posts: 400 Location: Portland, OR
|
Posted: Fri Oct 21, 2005 11:39 pm Post subject: |
|
|
crazycat wrote: | I don't know exactly about -pipe, but I think it makes gcc use pipes instead of temporary files. |
Correct. This should speed up compilation without changing the code that gets produced. |
|
|
|
revertex l33t
Joined: 23 Apr 2003 Posts: 806
|
Posted: Sat Oct 22, 2005 2:35 am Post subject: |
|
|
Thanks for the enlightenment, guys.
I have just one more question: I recently changed my motherboard to a not-so-new Socket 754 Athlon 64 board, but I'm still running x86.
I don't plan to migrate to x86_64 until things like Flash are available for 64-bit.
Do these CFLAGS seem sane for recompiling my system?
Code: | "-O2 -march=athlon64 -fforce-addr -frename-registers -ftracer -fprefetch-loop-arrays -fomit-frame-pointer -pipe" |
|
|
|
|
3BEPb n00b
Joined: 20 Jun 2005 Posts: 50
|
Posted: Mon Oct 24, 2005 8:03 pm Post subject: |
|
|
I recently rebuilt from scratch with these flags:
#-----Exp-----
CFLAGS="-march=athlon64 -O3 -ffast-math -funroll-all-loops -fpeel-loops -ftracer -funswitch-loops -funit-at-a-time -pipe"
CHOST="x86_64-pc-linux-gnu"
CXXFLAGS="${CFLAGS}"
#Wl,--relax will break glibc
#You may even consider using LDFLAGS=" " when building these packages.
# gstreamer
# openoffice
LDFLAGS="-Wl,-O1 -Wl,--sort-common -z combreloc -Wl,--enable-new-dtags -Wl,--relax"
No problems, except for an X.org memory leak (possibly related to the Nvidia driver).
The flags are taken from the AMD PDF file (their recommendation).
P.S.
I recently found this in that document:
-Bsymbolic. If building a dynamic library, -Bsymbolic can generally improve the performance of
code. Using this option causes references to be bound to global objects when building shared
libraries.
I'll try adding -Wl,-Bsymbolic to LDFLAGS and wait for the results. |
|
|
|
energyman76b Advocate
Joined: 26 Mar 2003 Posts: 2048 Location: Germany
|
Posted: Tue Oct 25, 2005 12:41 am Post subject: |
|
|
Hi,
the only reason your system is not horribly broken:
-ffast-math is filtered out by almost all ebuilds.
Thanks to guys like you, tons of ebuilds have filter-flags or strip-flags included.
Adding -ffast-math is just stupid.
And the PDF on AMD's site is talking about single applications (not whole systems), and to the programmers of those applications (who know their code), not to end users (who do not know the code).
-funit-at-a-time is not much smarter - it does not hurt, but it is part of -O2, so it is superfluous.
Maybe you should read the gcc man page. |
|
|
|
3BEPb n00b
Joined: 20 Jun 2005 Posts: 50
|
Posted: Tue Oct 25, 2005 5:16 am Post subject: |
|
|
energyman76b wrote: | Hi,
the only reason your system is not horribly broken:
-ffast-math is filtered out by almost all ebuilds.
Thanks to guys like you, tons of ebuilds have filter-flags or strip-flags included.
Adding -ffast-math is just stupid.
And the PDF on AMD's site is talking about single applications (not whole systems), and to the programmers of those applications (who know their code), not to end users (who do not know the code).
-funit-at-a-time is not much smarter - it does not hurt, but it is part of -O2, so it is superfluous.
Maybe you should read the gcc man page. |
-ffast-math - "-ffast-math will give good boosts for certain packages, but will probably break others" - that's from the first page. If my system is stable, why not use it? And it's filtered out of the ebuilds where it can affect the build.
-funit-at-a-time - a mistake, but a small one.
1. It's my system, and my choice.
2. Overclocking hardware is the same kind of experimental process as playing with CFLAGS, so why do I and other people do it? Maybe it's just interesting?
3. We are all learning from our mistakes; don't ever say that mistakes are stupid (unless they are REALLY stupid). |
|
|
|
enderandrew l33t
Joined: 25 Oct 2005 Posts: 731
|
Posted: Wed Oct 26, 2005 9:43 am Post subject: |
|
|
Looking at these xmame benchmarks, I noticed a few things.
-funroll-loops got you a 1.91% increase on average.
-funroll-all-loops got you a 2.42% increase on average.
I am fairly new to the community, and I have yet to build my first Gentoo install. I'm still reading all the docs. But if -funroll-all-loops does in fact break things, I'm not sure it is worth an extra half a percentage point. However, if you do want to be a ricer, it does technically seem to beat -funroll-loops. The next thing I noticed is this:
-funswitch-loops actually cost you 0.18% on average.
When you put in every crazy flag at once, you only got a 2.21% increase, which was less than using -funroll-all-loops by itself. It would appear the flags might even interfere with one another. Instead of stacking benefits, you seem to get worse performance from the combined flags than from one alone.
When I build for the first time, I'll probably test a safe and simple set of CFLAGS, but also try building with a "ricer" setup like:
CFLAGS="-O3 -march=athlon64 -ffast-math -funroll-all-loops -fpeel-loops -ftracer -pipe"
I'm curious what my results will be. |
|
|
|
flipik n00b
Joined: 23 Oct 2005 Posts: 2 Location: Bratislava, Slovakia
|
Posted: Wed Oct 26, 2005 11:51 am Post subject: |
|
|
I set CFLAGS to "-march=athlon64 -mtune=athlon64 -O3 -funroll-loops -fpeel-loops -ftracer -pipe" and did emerge -e world..
Everything seems to work just fine. |
|
|
|
3BEPb n00b
Joined: 20 Jun 2005 Posts: 50
|
Posted: Wed Oct 26, 2005 3:14 pm Post subject: |
|
|
In the end I settled on these flags:
#-----Exp-----
CFLAGS="-march=athlon64 -mtune=athlon64 -O3 -ffast-math -funroll-all-loops -fpeel-loops -ftracer -funswitch-loops -pipe"
CHOST="x86_64-pc-linux-gnu"
CXXFLAGS="${CFLAGS}"
#Wl,--relax will break glibc
#You may even consider using LDFLAGS="" when building these packages.
# gstreamer
# openoffice
#LDFLAGS=""
LDFLAGS="-Wl,-O1 -Wl,--sort-common -z combreloc -Wl,--enable-new-dtags -Wl,--relax"
MAKEOPTS="-j2"
-------------------
The system is stable, with no compile problems (except building glibc with -Wl,--relax).
It seems X.org has finally stopped eating memory. Average memory usage by X is 102MB, which is not very good, but after a day of uptime the memory usage is stable. |
|
|
|
nxsty Veteran
Joined: 23 Jun 2004 Posts: 1556 Location: .se
|
Posted: Wed Oct 26, 2005 5:59 pm Post subject: |
|
|
Way too much!
Everybody should stick with the default flags and stop wasting their time. Overzealous flags like that will only degrade performance and likely cause breakage. |
|
|
|
energyman76b Advocate
Joined: 26 Mar 2003 Posts: 2048 Location: Germany
|
Posted: Wed Oct 26, 2005 8:35 pm Post subject: |
|
|
3BEPb wrote: | -ffast-math - "-ffast-math will give good boosts for certain packages, but will probably break others" - that's from the first page. If my system is stable, why not use it? And it's filtered out of the ebuilds where it can affect the build.
-funit-at-a-time - a mistake, but a small one.
1. It's my system, and my choice.
2. Overclocking hardware is the same kind of experimental process as playing with CFLAGS, so why do I and other people do it? Maybe it's just interesting?
3. We are all learning from our mistakes; don't ever say that mistakes are stupid (unless they are REALLY stupid). |
1. The ebuilds have to cope with it, so you are loading additional work onto the devs and depriving other users of their chances. Because of -ffast-math users like you, some ebuilds just strip ALL flags. Thank you very much.
2. No, overclocking voids your warranty and gives you random brokenness. Stupid CFLAGS just cost everybody a lot of time, and give you certain brokenness.
3. So why not listen to the ones who already made these mistakes, back when Gentoo was 'new'? -ffast-math is bad. Why do you insist on not believing it?
oh and
4. every application that can safely use -ffast-math already has this option set in its makefile. |
|
|
|
energyman76b Advocate
Joined: 26 Mar 2003 Posts: 2048 Location: Germany
|
Posted: Wed Oct 26, 2005 8:37 pm Post subject: |
|
|
flipik wrote: | I set CFLAGS to "-march=athlon64 -mtune=athlon64 -O3 -funroll-loops -fpeel-loops -ftracer -pipe" and did emerge -e world..
Everything seems to work just fine. |
Yeah, but you do not need -mtune when you are using -march.
Please read man gcc about -mtune and -march, and why setting both is superfluous and may give you unwanted results. |
|
|
|
stonie Tux's lil' helper
Joined: 03 Jun 2003 Posts: 87 Location: S'Minga, Halleluja
|
Posted: Wed Oct 26, 2005 9:01 pm Post subject: |
|
|
There are so many posts on CFLAGS in this forum. I don't want to be offensive, but someone could get the impression that this is turning into a religion on both sides......
here is another example:
https://forums.gentoo.org/viewtopic-t-378077-postdays-0-postorder-asc-start-0.html
and to heat up the flaming:
I am using custom CFLAGS (and yes, I am so stupid as to use -ffast-math), but I haven't had a chance to bug anybody with compile problems yet.
My personal opinion: as long as you know it was you who burned your own system, don't blame the others. |
|
|
|
mudrii l33t
Joined: 26 Jun 2003 Posts: 789 Location: Singapore
|
Posted: Thu Oct 27, 2005 8:23 am Post subject: |
|
|
energyman76b wrote: | 1. The ebuilds have to cope with it, so you are loading additional work onto the devs and depriving other users of their chances. Because of -ffast-math users like you, some ebuilds just strip ALL flags. Thank you very much.
2. No, overclocking voids your warranty and gives you random brokenness. Stupid CFLAGS just cost everybody a lot of time, and give you certain brokenness.
3. So why not listen to the ones who already made these mistakes, back when Gentoo was 'new'? -ffast-math is bad. Why do you insist on not believing it?
oh and
4. every application that can safely use -ffast-math already has this option set in its makefile. |
Code: |
Some folk may object to my use of -ffast-math however, in numerous accuracy tests, -ffast-math produces code that is both faster and more accurate than code generated without it. Yes, -ffast-math has other aspects that make for interesting debate; however, such discussions belong in another article. If you don't use -ffast-math, you're ignoring many of your processor's most powerful features. |
From the Acovea developer, see http://www.coyotegulch.com/reviews/gcc4/index.html
and for more info see http://www.coyotegulch.com/products/acovea/index.html |
|
|
|
nxsty Veteran
Joined: 23 Jun 2004 Posts: 1556 Location: .se
|
Posted: Thu Oct 27, 2005 9:38 am Post subject: |
|
|
mudrii wrote: | Code: |
Some folk may object to my use of -ffast-math however, in numerous accuracy tests, -ffast-math produces code that is both faster and more accurate than code generated without it. Yes, -ffast-math has other aspects that make for interesting debate; however, such discussions belong in another article. If you don't use -ffast-math, you're ignoring many of your processor's most powerful features. |
From the Acovea developer, see http://www.coyotegulch.com/reviews/gcc4/index.html
and for more info see http://www.coyotegulch.com/products/acovea/index.html |
That's sort of true. But -ffast-math still produces inaccurate code in most cases. You should only use it for specific programs where speed is the only thing that matters and accuracy does not. Adding it to the CFLAGS in make.conf is just plain dumb. Besides, -ffast-math doesn't work very well with -mfpmath=sse, which is the default on amd64, so the gains are very small on this arch while the negative effects are just as bad. |
|
|
|
mudrii l33t
Joined: 26 Jun 2003 Posts: 789 Location: Singapore
|
Posted: Fri Oct 28, 2005 2:39 am Post subject: |
|
|
I have been using -ffast-math for more than 3 years and never had any problems with this flag. I did have problems with other CFLAGS, but never with -ffast-math. |
|
|
|
energyman76b Advocate
Joined: 26 Mar 2003 Posts: 2048 Location: Germany
|
Posted: Fri Oct 28, 2005 4:40 pm Post subject: |
|
|
mudrii wrote: | I have been using -ffast-math for more than 3 years and never had any problems with this flag. I did have problems with other CFLAGS, but never with -ffast-math. |
Please check your ebuilds.
Almost all of them filter -ffast-math.
The others have it set in their Makefiles.
So there is no reason to add it to your CFLAGS. |
|
|
|
tnt Veteran
Joined: 27 Feb 2004 Posts: 1222
|
Posted: Fri Oct 28, 2005 5:27 pm Post subject: |
|
|
Are any of these flags dangerous or bad in some other way?
Code: | CFLAGS="-march=athlon64 -O2 -pipe -funroll-all-loops -funit-at-a-time -fpeel-loops -ftracer -funswitch-loops -msse3" |
|
|
|
6D7474 Tux's lil' helper
Joined: 08 Sep 2005 Posts: 135
|
Posted: Fri Oct 28, 2005 10:19 pm Post subject: |
|
|
I would reconsider the use of -funroll-all-loops - it will make your binaries much bigger, and that can actually slow down your system... |
|
|
|
energyman76b Advocate
Joined: 26 Mar 2003 Posts: 2048 Location: Germany
|
Posted: Fri Oct 28, 2005 11:15 pm Post subject: |
|
|
Yeah, -funroll-all-loops is usually not smart.
From man gcc:
-funroll-all-loops
Unroll all loops, even if their number of iterations is uncertain when the loop is entered. This usually makes
programs run more slowly
-msse3 is only 'safe' if you know for sure that your CPU supports it (Venice AMD64). |
|
|
|
tnt Veteran
Joined: 27 Feb 2004 Posts: 1222
|
Posted: Sat Oct 29, 2005 1:41 am Post subject: |
|
|
energyman76b wrote: | -msse3 is only 'safe' if you know for sure that your CPU supports it (Venice AMD64). |
It's a Sempron 2800+ 'BX' with the Palermo core, and it has the 'PNI' flag, so it should have SSE3.
Anyway, thank you for the -funroll-all-loops tip - a very useful one! |
|
|
|
energyman76b Advocate
Joined: 26 Mar 2003 Posts: 2048 Location: Germany
|
Posted: Sat Oct 29, 2005 4:17 pm Post subject: |
|
|
tnt wrote: | energyman76b wrote: | -msse3 is only 'safe' if you know for sure that your CPU supports it (Venice AMD64). |
It's a Sempron 2800+ 'BX' with the Palermo core, and it has the 'PNI' flag, so it should have SSE3.
Anyway, thank you for the -funroll-all-loops tip - a very useful one! |
I read some weeks ago that some CPUs report the PNI flag without having SSE3.
Try running this:
cat test_pni.c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

uint8_t __attribute__((aligned(64))) current[64];
uint8_t previous[64];

int main(void)
{
    int i;
    uint64_t result;
    uint32_t _eax, _ebx, _ecx, _edx;
    uint32_t vendor[3];
    char _cpuid[13];
    uint8_t *ptr0 = current;
    uint8_t *ptr1 = previous;

    /* Leaf 0: the vendor string comes back in EBX, EDX, ECX (in that order). */
    __asm__ __volatile__ (
        "cpuid\n"
        : "=a" (_eax),
          "=b" (vendor[0]), "=d" (vendor[1]), "=c" (vendor[2])
        : "a" (0) );
    memcpy( _cpuid, vendor, 12 );
    _cpuid[12] = 0;
    printf( "cpuid(0) returns %u (%s)\n", _eax, _cpuid );

    /* Leaf 1: feature flags; PNI/SSE3 is reported in bit 0 of ECX. */
    __asm__ __volatile__ (
        "cpuid\n"
        : "=a" (_eax), "=b" (_ebx), "=c" (_ecx), "=d" (_edx)
        : "a" (1) );
    printf( "cpuid(1) returns %08x %08x %08x %08x\n",
            _eax, _ebx, _ecx, _edx );

    memset( current, 0xaa, 64 );
    memset( previous, 0x55, 64 );
    /* haddps is an SSE3 instruction, so a CPU without SSE3 faults here. */
    /* xmm2 is never initialised, so the printed result itself means nothing. */
    for( i = 0; i < 4; i ++ ) {
        __asm__ __volatile__ (
            "movdqa %0, %%xmm0\n"
            "movdqu %1, %%xmm1\n"
            "psadbw %%xmm1, %%xmm0\n"
            "paddw %%xmm0, %%xmm2\n"
            "haddps %%xmm2, %%xmm2\n"
            "haddps %%xmm2, %%xmm2\n"
            : : "m" (*ptr0),
                "m" (*ptr1) : "xmm0", "xmm1", "xmm2" );
        ptr0 += 16;
        ptr1 += 16;
    }
    __asm__ __volatile__ (
        "movq %%xmm2, %0\n"
        : "=m" (result) );
    printf( "Result is %" PRIu64 "\n", result );
    return 0;
}
Save it as test_pni.c, then compile and run it.
If it dies with an illegal-instruction error, you do not have SSE3.
If not, you have SSE3 and everything is fine. |
|
|
|
tnt Veteran
Joined: 27 Feb 2004 Posts: 1222
|
Posted: Sat Oct 29, 2005 4:26 pm Post subject: |
|
|
Code: | [tnt@master ~]$ ./test.bin
cpuid(0) returns 1 (AuthenticAMD)
cpuid(1) returns 00020fc2 00000800 00000001 078bfbff
Result is 496498219533200
[tnt@master ~]$ |
Seems like it has SSE3. Thank you for this test - very useful! |
|