Gentoo Forums
[gcc 3.4] AMD's Recommended CFLAGS
nxsty
Veteran

Joined: 23 Jun 2004
Posts: 1556
Location: .se

Posted: Fri Oct 21, 2005 9:28 am

revertex wrote:
@Joffer:

Doesn't "-march=athlon64" include "-mfpmath=sse"?

And why do people still add "-fomit-frame-pointer" to their CFLAGS when they use -O, -O2, -O3 or -Os?

I found this on the GCC online docs page:

Quote:
-fomit-frame-pointer
Don't keep the frame pointer in a register for functions that don't need one. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions. It also makes debugging impossible on some machines.

On some machines, such as the VAX, this flag has no effect, because the standard calling sequence automatically handles the frame pointer and nothing is saved by pretending it doesn't exist. The machine-description macro FRAME_POINTER_REQUIRED controls whether a target machine supports this flag. See Register Usage.

Enabled at levels -O, -O2, -O3, -Os.


It seems "-fomit-frame-pointer" is completely redundant if you use "-O?".


Yes, -fomit-frame-pointer is included in -O on amd64, so it's useless to have it in CFLAGS. But there are other arches, like x86, where it isn't, so that's probably why people have it. -march=athlon64 and friends don't include -mfpmath=sse, but it's on by default on amd64 compilers.

Gnufsh
Guru

Joined: 28 Dec 2002
Posts: 400
Location: Portland, OR

Posted: Fri Oct 21, 2005 1:13 pm

AFAIK -fomit-frame-pointer is implied by -O, -O2, -O3 and -Os when it does not break debugging. It breaks debugging on x86, so you have to include it manually there. I think that habit has just spilled over into amd64, where it doesn't break debugging and you don't need to specify it separately.

sirdilznik
l33t

Joined: 28 Apr 2005
Posts: 731

Posted: Fri Oct 21, 2005 6:33 pm

I use CFLAGS="-march=k8 -O3 -pipe"
I figure keep it simple. Zero stability problems and my system is pretty fast :D

Gnufsh
Guru

Joined: 28 Dec 2002
Posts: 400
Location: Portland, OR

Posted: Fri Oct 21, 2005 11:39 pm

crazycat wrote:
I don't know exactly about -pipe, but I think it makes gcc use pipes instead of temporary files.

Correct. This should speed up compilation without changing the code that gets produced.

revertex
l33t

Joined: 23 Apr 2003
Posts: 806

Posted: Sat Oct 22, 2005 2:35 am

Thanks for the enlightenment, guys.

I have just one more question. I recently changed my motherboard to a not-so-new Socket 754 Athlon 64 one, but I'm still running x86.
I don't plan to migrate to x86_64 until things like Flash are available for 64-bit.
Do these CFLAGS seem sane for recompiling my system?

Code:
"-O2 -march=athlon64 -fforce-addr -frename-registers -ftracer -fprefetch-loop-arrays -fomit-frame-pointer -pipe"

3BEPb
n00b

Joined: 20 Jun 2005
Posts: 50

Posted: Mon Oct 24, 2005 8:03 pm

I recently rebuilt from scratch with these flags:

#-----Exp-----
CFLAGS="-march=athlon64 -O3 -ffast-math -funroll-all-loops -fpeel-loops -ftracer -funswitch-loops -funit-at-a-time -pipe"
CHOST="x86_64-pc-linux-gnu"
CXXFLAGS="${CFLAGS}"

#Wl,--relax will break glibc
#You may even consider using LDFLAGS=" " when building these packages.
# gstreamer
# openoffice

LDFLAGS="-Wl,-O1 -Wl,--sort-common -z combreloc -Wl,--enable-new-dtags -Wl,--relax"

No problems, except for an X.org memory leak (possibly related to the Nvidia driver).
The flags are taken from AMD's PDF (their recommendations).

P.S.
I recently found this in that document:

-Bsymbolic. If building a dynamic library, -Bsymbolic can generally improve the performance of
code. Using this option causes references to be bound to global objects when building shared
libraries.
I'll try adding -Wl,-Bsymbolic to LDFLAGS and wait for the results.

energyman76b
Advocate

Joined: 26 Mar 2003
Posts: 2048
Location: Germany

Posted: Tue Oct 25, 2005 12:41 am

Hi,

The only reason your system is not horribly broken:

-ffast-math is filtered out by almost all ebuilds.

Thanks to guys like you, tons of ebuilds have filter-flags or strip-flags included.

Adding -ffast-math is just stupid.

And the PDF on AMD's site talks about single applications (not whole systems), and it is aimed at the programmers of those applications (who know their code), not at end users (who do not know the code).

-funit-at-a-time is not much smarter. It does not hurt, but it is part of -O2, so it is superfluous.

Maybe you should read the manpage of gcc.

3BEPb
n00b

Joined: 20 Jun 2005
Posts: 50

Posted: Tue Oct 25, 2005 5:16 am

energyman76b wrote:
Hi,
the only reason your system is not horribly broken:
-ffast-math is filtered out by almost all ebuilds.

Thanks to guys like you, tons of ebuilds have filter-flags or strip-flags included.

Adding -ffast-math is just stupid.

And the PDF on AMD's site talks about single applications (not whole systems), and it is aimed at the programmers of those applications (who know their code), not at end users (who do not know the code).
-funit-at-a-time is not much smarter. It does not hurt, but it is part of -O2, so it is superfluous.
Maybe you should read the manpage of gcc.


-ffast-math: "-ffast-math will give good boosts for certain packages, but will probably break others". That's from the first page. If my system is stable, why not use it? And it's filtered out of the ebuilds where it can affect the build.
-funit-at-a-time: a mistake, but a small one.

1. It's my system and my choice.
2. Overclocking hardware is the same kind of experimental process as playing with CFLAGS, so why do I and other people do it? Maybe it's just interesting?
3. We all learn from our mistakes. Don't ever say that mistakes are stupid (unless they are REALLY stupid).

enderandrew
l33t

Joined: 25 Oct 2005
Posts: 731

Posted: Wed Oct 26, 2005 9:43 am

toofastforyahuh wrote:
Not entirely true.

http://www.anthrofox.org/code/mame/xmame64_bench88.html


Looking at these benchmarks in xmame, I noticed a few things.

-funroll-loops got you a 1.91% increase on average.
-funroll-all-loops got you a 2.42% increase on average.

I am fairly new to the community, and I've yet to build my first install of Gentoo. I'm still reading all the docs. But if -funroll-all-loops does in fact break things, I'm not sure it is worth half a percentage point. However, if you do want to be a ricer, it does technically seem to beat -funroll-loops. The next thing I notice is this:

-funswitch-loops actually cost you 0.18% on average.

When you put in every crazy CFLAG at once, you only got a 2.21% increase, which was less than using -funroll-all-loops by itself. It would appear the CFLAGS might even interfere with one another. Instead of stacking benefits, you seem to get worse performance from multiple CFLAGS than from one alone.

When I build for the first time, I'll probably test a safe and simple set of CFLAGS, but also try building on a "ricer" setup like:

CFLAGS="-O3 -march=athlon64 -ffast-math -funroll-all-loops -fpeel-loops -ftracer -pipe"

I'm curious what my results will be.

flipik
n00b

Joined: 23 Oct 2005
Posts: 2
Location: Bratislava, Slovakia

Posted: Wed Oct 26, 2005 11:51 am

I set CFLAGS to "-march=athlon64 -mtune=athlon64 -O3 -funroll-loops -fpeel-loops -ftracer -pipe" and did emerge -e world..
Everything seems to work just fine. :lol:

3BEPb
n00b

Joined: 20 Jun 2005
Posts: 50

Posted: Wed Oct 26, 2005 3:14 pm

Finally I settled on these flags:

#-----Exp-----
CFLAGS="-march=athlon64 -mtune=athlon64 -O3 -ffast-math -funroll-all-loops -fpeel-loops -ftracer -funswitch-loops -pipe"
CHOST="x86_64-pc-linux-gnu"
CXXFLAGS="${CFLAGS}"

#Wl,--relax will break glibc
#You may even consider using LDFLAGS="" when building these packages.
# gstreamer
# openoffice

#LDFLAGS=""
LDFLAGS="-Wl,-O1 -Wl,--sort-common -z combreloc -Wl,--enable-new-dtags -Wl,--relax"

MAKEOPTS="-j2"

-------------------
The system is stable, with no compile problems (except building glibc with -Wl,--relax).
It seems that X.org has finally stopped eating memory. Average memory usage by X is 102MB, which is not very good, but after a day of uptime the memory usage is stable.

nxsty
Veteran

Joined: 23 Jun 2004
Posts: 1556
Location: .se

Posted: Wed Oct 26, 2005 5:59 pm

Way too much!

Everybody should stick with the default flags and stop wasting their time. Overzealous flags like that will only degrade performance and likely cause breakage.

energyman76b
Advocate

Joined: 26 Mar 2003
Posts: 2048
Location: Germany

Posted: Wed Oct 26, 2005 8:35 pm

3BEPb wrote:
-ffast-math: "-ffast-math will give good boosts for certain packages, but will probably break others". That's from the first page. If my system is stable, why not use it? And it's filtered out of the ebuilds where it can affect the build.
-funit-at-a-time: a mistake, but a small one.

1. It's my system and my choice.
2. Overclocking hardware is the same kind of experimental process as playing with CFLAGS, so why do I and other people do it? Maybe it's just interesting?
3. We all learn from our mistakes. Don't ever say that mistakes are stupid (unless they are REALLY stupid).


1. The ebuilds have to cope with it, so you are loading additional work onto the devs and depriving other users of their chances. Because of -ffast-math users like you, some ebuilds just strip ALL flags. Thank you very much.
2. No, overclocking voids your warranty and gives you random brokenness. Stupid CFLAGS just cost everybody a lot of time. And they give you certain brokenness.
3. So why not listen to the ones who already made those mistakes, back when Gentoo was 'new'? -ffast-math is bad. Why do you insist on not believing it?

Oh, and
4. Every application that can safely use -ffast-math already has this option set in its makefile.

energyman76b
Advocate

Joined: 26 Mar 2003
Posts: 2048
Location: Germany

Posted: Wed Oct 26, 2005 8:37 pm

flipik wrote:
I set CFLAGS to "-march=athlon64 -mtune=athlon64 -O3 -funroll-loops -fpeel-loops -ftracer -pipe" and did emerge -e world..
Everything seems to work just fine. :lol:


Yeah, but you do not need -mtune when you are using -march, because -march=cpu-type already implies -mtune=cpu-type.
Please read man gcc about -mtune and -march and why setting both is superfluous and may give you unwanted results.

stonie
Tux's lil' helper

Joined: 03 Jun 2003
Posts: 87
Location: S'Minga, Halleluja

Posted: Wed Oct 26, 2005 9:01 pm

There are so many posts on CFLAGS in this forum. Without wanting to be offensive, someone could get the impression that this is turning into a religion on both sides...
Here is another example:
https://forums.gentoo.org/viewtopic-t-378077-postdays-0-postorder-asc-start-0.html

And to heat up the flaming:
I am using custom CFLAGS (and yes, I am so stupid as to use -ffast-math), but I haven't had a chance to bug anybody with compile problems yet ;)

My personal opinion: as long as you know it was you who burned your own system, don't blame others ;)

mudrii
l33t

Joined: 26 Jun 2003
Posts: 789
Location: Singapore

Posted: Thu Oct 27, 2005 8:23 am

energyman76b wrote:
1. The ebuilds have to cope with it, so you are loading additional work onto the devs and depriving other users of their chances. Because of -ffast-math users like you, some ebuilds just strip ALL flags. Thank you very much.
2. No, overclocking voids your warranty and gives you random brokenness. Stupid CFLAGS just cost everybody a lot of time. And they give you certain brokenness.
3. So why not listen to the ones who already made those mistakes, back when Gentoo was 'new'? -ffast-math is bad. Why do you insist on not believing it?

Oh, and
4. Every application that can safely use -ffast-math already has this option set in its makefile.



Code:

Some folk may object to my use of -ffast-math — however, in numerous accuracy tests, -ffast-math produces code that is both faster and more accurate than code generated without it. Yes, -ffast-math has other aspects that make for interesting debate; however, such discussions belong in another article. If you don't use -ffast-math, you're ignoring many of your processor's most powerful features.


That is from the Acovea developer; see http://www.coyotegulch.com/reviews/gcc4/index.html
For more info, see http://www.coyotegulch.com/products/acovea/index.html

nxsty
Veteran

Joined: 23 Jun 2004
Posts: 1556
Location: .se

Posted: Thu Oct 27, 2005 9:38 am

mudrii wrote:
Code:

Some folk may object to my use of -ffast-math — however, in numerous accuracy tests, -ffast-math produces code that is both faster and more accurate than code generated without it. Yes, -ffast-math has other aspects that make for interesting debate; however, such discussions belong in another article. If you don't use -ffast-math, you're ignoring many of your processor's most powerful features.


That is from the Acovea developer; see http://www.coyotegulch.com/reviews/gcc4/index.html
For more info, see http://www.coyotegulch.com/products/acovea/index.html


That's sort of true. But -ffast-math still produces inaccurate code in most cases. You should only use it for specific programs where speed is the only thing that matters and accuracy does not. Adding it to the CFLAGS in make.conf is just plain dumb. And besides, -ffast-math doesn't work very well with -mfpmath=sse, which is the default on amd64, so the gains are very small on this arch while the negative effects are just as bad.

mudrii
l33t

Joined: 26 Jun 2003
Posts: 789
Location: Singapore

Posted: Fri Oct 28, 2005 2:39 am

I have been using -ffast-math for more than 3 years and have never had any problems with this CFLAG. I did have problems with other CFLAGS, but never with -ffast-math.

energyman76b
Advocate

Joined: 26 Mar 2003
Posts: 2048
Location: Germany

Posted: Fri Oct 28, 2005 4:40 pm

mudrii wrote:
I have been using -ffast-math for more than 3 years and have never had any problems with this CFLAG. I did have problems with other CFLAGS, but never with -ffast-math.


Please check your ebuilds.

Almost all of them filter -ffast-math.

The others have it set in their Makefiles.

So there is no reason to add it to your CFLAGS.

tnt
Veteran

Joined: 27 Feb 2004
Posts: 1222

Posted: Fri Oct 28, 2005 5:27 pm

Are any of these flags dangerous or bad in any other way?

Code:
CFLAGS="-march=athlon64 -O2 -pipe -funroll-all-loops -funit-at-a-time -fpeel-loops -ftracer -funswitch-loops -msse3"


:?:

6D7474
Tux's lil' helper

Joined: 08 Sep 2005
Posts: 135

Posted: Fri Oct 28, 2005 10:19 pm

I would reconsider the use of -funroll-all-loops. It will make your binaries much bigger, and this can actually slow down your system...

energyman76b
Advocate

Joined: 26 Mar 2003
Posts: 2048
Location: Germany

Posted: Fri Oct 28, 2005 11:15 pm

Yeah, -funroll-all-loops is usually not smart.
From man gcc:
-funroll-all-loops
Unroll all loops, even if their number of iterations is uncertain when the loop is entered. This usually makes programs run more slowly.

-msse3 is only 'safe' if you know for sure that your CPU supports it (Venice AMD64).

tnt
Veteran

Joined: 27 Feb 2004
Posts: 1222

Posted: Sat Oct 29, 2005 1:41 am

energyman76b wrote:
-msse3 is only 'safe' if you know for sure that your CPU supports it (Venice AMD64).


It's a Sempron 2800+ 'BX' (Palermo core) and it has the 'PNI' flag, so it should have SSE3.

Anyway, thank you for the -funroll-all-loops tip. Very useful!

energyman76b
Advocate

Joined: 26 Mar 2003
Posts: 2048
Location: Germany

Posted: Sat Oct 29, 2005 4:17 pm

tnt wrote:
It's a Sempron 2800+ 'BX' (Palermo core) and it has the 'PNI' flag, so it should have SSE3.

Anyway, thank you for the -funroll-all-loops tip. Very useful!


I read some weeks ago that some CPUs report the PNI flag without actually having SSE3.
Try to run this:
cat test_pni.c
#include <stdint.h>
#include <stdio.h>   /* printf */
#include <string.h>  /* memset, memcpy */

uint8_t __attribute__((aligned(64))) current[64];
uint8_t previous[64];

int main(void)
{
    int i;
    uint64_t result;
    uint32_t _eax, _ebx, _ecx, _edx;
    char vendor[13];
    uint8_t *ptr0 = current;
    uint8_t *ptr1 = previous;

    /* cpuid leaf 0: the vendor string comes back in ebx, edx, ecx (in that order) */
    __asm__ __volatile__ (
        "cpuid\n"
        : "=a" (_eax), "=b" (_ebx), "=c" (_ecx), "=d" (_edx)
        : "a" (0) );
    memcpy( vendor, &_ebx, 4 );
    memcpy( vendor + 4, &_edx, 4 );
    memcpy( vendor + 8, &_ecx, 4 );
    vendor[12] = 0;
    printf( "cpuid(0) returns %u (%s)\n", _eax, vendor );

    /* cpuid leaf 1: feature flags; PNI/SSE3 is bit 0 of ecx */
    __asm__ __volatile__ (
        "cpuid\n"
        : "=a" (_eax), "=b" (_ebx), "=c" (_ecx), "=d" (_edx)
        : "a" (1) );
    printf( "cpuid(1) returns %08x %08x %08x %08x\n",
            _eax, _ebx, _ecx, _edx );

    memset( current, 0xaa, 64 );
    memset( previous, 0x55, 64 );
    for( i = 0; i < 4; i ++ ) {
        /* haddps is the SSE3 instruction here; the others are SSE2.
           xmm2 is never initialized, so the final number itself means nothing. */
        __asm__ __volatile__ (
            "movdqa %0, %%xmm0\n"
            "movdqu %1, %%xmm1\n"
            "psadbw %%xmm1, %%xmm0\n"
            "paddw %%xmm0, %%xmm2\n"
            "haddps %%xmm2, %%xmm2\n"
            "haddps %%xmm2, %%xmm2\n"
            : : "m" (*ptr0),
                "m" (*ptr1) : "xmm0", "xmm1", "xmm2" );
        ptr0 += 16;
        ptr1 += 16;
    }
    __asm__ __volatile__ (
        "movq %%xmm2, %0\n"
        : "=m" (result) );
    printf( "Result is %llu\n", (unsigned long long) result );
    return 0;
}

Save it as test_pni.c, compile it (e.g. gcc -o test_pni test_pni.c) and run it.
If it dies with an error (typically "Illegal instruction"), you do not have SSE3.
If not, you have SSE3 and everything is fine.

tnt
Veteran

Joined: 27 Feb 2004
Posts: 1222

Posted: Sat Oct 29, 2005 4:26 pm

Code:
[tnt@master ~]$ ./test.bin
cpuid(0) returns 1 (AuthenticAMD)
cpuid(1) returns 00020fc2 00000800 00000001 078bfbff
Result is 496498219533200
[tnt@master ~]$


Seems like it has SSE3. Thank you for this test, very useful!

All times are GMT
Page 3 of 5