[gcc 3.4] AMD's Recommended CFLAGS

Locked
117 posts

Post by nxsty » Sat Jan 14, 2006 9:55 pm

sigmalll: True, but the problem is that there are no magic CFLAGS that will make your whole system faster. Global CFLAGS should always be "-O2 -march= -pipe", perhaps with -fomit-frame-pointer if you're on x86. It doesn't matter if -O3 is shown to be faster in some benchmarks; it shouldn't be used globally, because 99% of packages won't benefit from the higher optimizations and will in fact often run slower because of the extra bloat they cause.

Post by sigmalll » Sat Jan 14, 2006 11:12 pm

nxsty wrote: sigmalll: True, but the problem is that there are no magic CFLAGS that will make your whole system faster. Global CFLAGS should always be "-O2 -march= -pipe", perhaps with -fomit-frame-pointer if you're on x86. It doesn't matter if -O3 is shown to be faster in some benchmarks; it shouldn't be used globally, because 99% of packages won't benefit from the higher optimizations and will in fact often run slower because of the extra bloat they cause.

I understand the argument that -O3 -fblah flags are a bit pointless for a lot of software; who gives a monkey's if 'less' is 7% faster (likewise, does 7% slower really matter either?). But setting a high level of optimisation globally does guarantee that all applications that may benefit, do. I don't think anybody in their right mind would want to manage optimisations on an application-by-application basis, especially if they're running a media-intensive desktop (and I don't expect the devs to start adding performance-based flags to ebuilds).

But the compiler is where the most performance gains can be obtained, and good CFLAGS really do make parts of your system run faster.

(I do have to add that this is only really an issue because GCC's focus in the past has always been portability rather than performance. In some cases this actually makes Linux applications slower than their Windows counterparts.)

Post by cybrjackle » Sun Jan 15, 2006 3:54 am

I use hardcore ricing flags

Code: Select all

# grep CFLAGS /etc/make.conf
CFLAGS="-O2 -march=k8 -pipe"
CXXFLAGS="${CFLAGS}"
:roll:

Post by LoSeR_5150 » Sun Jan 15, 2006 11:46 am

I am fairly new to Gentoo, but within my 10 months I have wasted tons of time playing with CFLAGS. Well, I guess not wasted, because I think I learned from the experience. My CFLAGS used to look like this:

Code: Select all

CFLAGS="-march=athlon64 -O2 -ffast-math -fforce-addr -fmove-all-movables -fno-ident -fomit-frame-pointer -fpeel-loops -fprefetch-loop-arrays -frename-registers -ftracer -funroll-loops -funswitch-loops -fweb -pipe"


What I have learned is that while some app, say nbench, might see decent gains (I mean like 5-7%, woo :roll: ), it makes just as many apps slower by the same percentage if not more, resulting in an overall slower system. So it has been my experience that the fewer ricey CFLAGS the better. My current system flags are:

Code: Select all

CFLAGS="-march=athlon64 -O2 -fforce-addr -fno-ident -ftracer -fweb -pipe" 
And I can say that my system runs much better (faster compile times, better app start times, better stability) now than when I had all my ricey CFLAGS. I hate to say it, but unless you are trying to speed up a certain app, maybe some type of media-intensive app, it seems that it isn't worth the time to mess with your CFLAGS extensively. Just my .02
Opteron 1356@2.4Ghz
6GB DDR2 800Mhz
128MB Quadro NVS 210S
640GB Western Digital HD
*Gentoo-x86_64-2.6.30-r1

Opteron175@2.2GHz
2GB DDR 400MHz
256MB Quadro 1400 Go
(2) 80GB Segate HDs: RAID0
*Gentoo-x86_64-2.6.30-r1

Post by MorLipf » Sun Jan 15, 2006 11:51 am

My current Cflags are:

Code: Select all

CFLAGS="-march=k8 -O3 -pipe -fomit-frame-pointer"
Should I optimize them?

Post by nxsty » Sun Jan 15, 2006 12:31 pm

sigmalll wrote: I understand the argument that -O3 -fblah flags are a bit pointless for a lot of software; who gives a monkey's if 'less' is 7% faster (likewise, does 7% slower really matter either?). But setting a high level of optimisation globally does guarantee that all applications that may benefit, do. I don't think anybody in their right mind would want to manage optimisations on an application-by-application basis, especially if they're running a media-intensive desktop (and I don't expect the devs to start adding performance-based flags to ebuilds).

But the compiler is where the most performance gains can be obtained, and good CFLAGS really do make parts of your system run faster.

(I do have to add that this is only really an issue because GCC's focus in the past has always been portability rather than performance. In some cases this actually makes Linux applications slower than their Windows counterparts.)

There is always a tradeoff between speed and size when the compiler is optimizing. -O2 is usually a good balance: it turns on most optimizations but still doesn't bloat code much. Higher optimizations like -O3, -funroll-loops and friends have the side effect of making binaries larger. Larger binaries mean more disk reads, more memory usage and a larger chance of cache misses, all of which slow execution. This is acceptable for the specific applications that actually benefit from the higher optimizations, but for most things it's just unnecessary bloat that is bad for performance. In fact -Os is usually the best option, since most applications don't benefit from the extra optimizations but do benefit from the smaller binary size. Compiler optimization is only a small part of performance; most of it comes from well-written code.

Post by robak » Wed Jan 18, 2006 4:41 am

I just tried a few CFLAGS to optimize POVray, but the best result I could get is 28 min 34 sec on this hardware:

AMD Athlon64 3000+ 1.8 GHz Venice core
2*512 MB RAM in dual-channel mode

Can someone tell me how to optimize the system to get better results?
I have been compiling world for about 3 days now (I have "only" 135 packages to compile, so I could test a lot of flag combinations) and I am a little bit tired ;)

Post by mbar » Thu Jan 19, 2006 8:46 pm

Or you could just overclock your CPU by a mere 200 MHz and make it really fly; no ricer CFLAGS will do that for you.

I settled on "-Os -march=k8 -pipe -fomit-frame-pointer -falign-functions=5"

Post by alexlm78 » Wed Feb 08, 2006 5:50 pm

Interesting, I should try it! :twisted:
"This is a different kind of world, you need a different kind of software"

Linux User# 315201
100% Chapin hecho en Guatemala

Post by HacTek » Thu Feb 09, 2006 1:07 am

Forgive my naivety, but is there a way to specify CFLAGS on a package-by-package basis?
Something similar to the way USE flags can be set with the package.use file.

If not, then why not?

It seems from this debate that a simple solution would be to set some safe and sensible CFLAGS for the system.
Perhaps:

Code: Select all

 -O2 -march=XX -pipe
and then, for a package which would benefit, add say

Code: Select all

category/packagename -fwhatever-you-want
into a file called package.cflags.

I reckon this could keep both sides of the fence happy.
You would get overall system stability, and the ricers can have fun optimising an app without breaking other packages as easily.

It might even take the pressure off the developers having to strip flags out of the ebuilds.

Any reason why this wouldn't work?
SELECT * FROM Managers WHERE Clue > 0
0 Rows Returned

Post by barry » Thu Feb 09, 2006 1:09 am

It's important to remember that a lot of these high-performance optimisation flags are designed for developers to compile and test their own software and get the best performance out of it. They were never intended to be applied blindly to an entire production system.

As others have said, most of the packages that will see massive performance gains from flags like -funroll-loops and -ffast-math already have these included in the ebuild, so you don't need to enable them yourself.

"-march=k8 -O2 -pipe" should produce well-optimised and rock-solid code for everybody, and builds like mplayer, lame and so on will already have safe higher-performing flags applied.

"-march=k8 -O2 -pipe -frename-registers -fweb" is the same as -O3 minus a single flag (-finline-functions, with GCC 3.4) that bloats code and quite often causes a slowdown, so these flags will generally improve performance over a plain -O2 without breaking anything.

Post by SoTired » Thu Feb 09, 2006 2:30 am

HacTek wrote: Forgive my naivety, but is there a way to specify CFLAGS on a package-by-package basis?
Something similar to the way USE flags can be set with the package.use file.

There is a way; it's just not officially endorsed. See http://forums.gentoo.org/viewtopic-t-28 ... art-0.html

Post by HacTek » Thu Feb 09, 2006 2:42 am

It looks like a pretty good script to me.
Not that I have any experience with bash scripting, but I've got to learn sometime.

Is there any active development happening with this approach?
It looks like a good candidate for moving towards an officially supported feature.

Post by pacho2 » Thu Aug 31, 2006 5:05 pm

energyman76b wrote:
tnt wrote:
energyman76b wrote: -msse3 is only 'safe' if you know for sure that your CPU supports it (Venice AMD64).
It's a Sempron 2800+ 'BX' Palermo core and it has the 'PNI' flag, so it should have SSE3.

Anyway, thank you for the '-funroll-all-loops' tip - a very useful one!
I read some weeks ago that some CPUs report the PNI flag without having SSE3.
Try to run this:
cat test_pni.c
#include <stdint.h>
#include <stdio.h>   /* printf */
#include <string.h>  /* memset */

uint8_t __attribute__((aligned(64))) current[64];
uint8_t previous[64];

int main(void)
{
    int i;
    uint64_t result;
    uint32_t _eax, _ebx, _ecx, _edx;
    uint8_t _cpuid[13];
    uint32_t *_cpuid0 = (uint32_t*) _cpuid;
    uint32_t *_cpuid1 = (uint32_t*) ( _cpuid + 4 );
    uint32_t *_cpuid2 = (uint32_t*) ( _cpuid + 8 );
    uint8_t *ptr0 = current;
    uint8_t *ptr1 = previous;

    /* cpuid(0): vendor string comes back in EBX, EDX, ECX (in that order) */
    __asm__ __volatile__ (
        "cpuid\n"
        : "=a" (_eax),
          "=b" (*_cpuid0), "=d" (*_cpuid1), "=c" (*_cpuid2)
        : "a" (0) );
    _cpuid[12] = 0;
    printf( "cpuid(0) returns %u (%s)\n", _eax, (char*) _cpuid );

    /* cpuid(1): feature flags */
    __asm__ __volatile__ (
        "cpuid\n"
        : "=a" (_eax), "=b" (_ebx), "=c" (_ecx), "=d" (_edx)
        : "a" (1) );
    printf( "cpuid(1) returns %08x %08x %08x %08x\n",
            _eax, _ebx, _ecx, _edx );

    memset( current, 0xaa, 64 );
    memset( previous, 0x55, 64 );

    /* haddps is the SSE3 instruction being exercised here.
       xmm2 is never zeroed, so the printed result is not meaningful;
       the only question is whether the instructions execute at all. */
    for( i = 0; i < 4; i ++ ) {
        __asm__ __volatile__ (
            "movdqa %0, %%xmm0\n"
            "movdqu %1, %%xmm1\n"
            "psadbw %%xmm1, %%xmm0\n"
            "paddw %%xmm0, %%xmm2\n"
            "haddps %%xmm2, %%xmm2\n"
            "haddps %%xmm2, %%xmm2\n"
            : : "m" (*ptr0),
                "m" (*ptr1) : "xmm0", "xmm1", "xmm2" );
        ptr0 += 16;
        ptr1 += 16;
    }
    __asm__ __volatile__ (
        "movq %%xmm2, %0\n"
        : "=m" (result) );
    printf( "Result is %llu\n", (unsigned long long) result );
    return 0;
}

Save it as test_pni.c, compile and run it.
If it throws errors (typically an illegal-instruction crash when it reaches haddps), you do not have SSE3.
If not, you have SSE3 and everything is fine.

I have an Athlon 3200+ Winchester, I have compiled it and I get this output:

Code: Select all

./test_pni
cpuid(0) returns 1 (AuthenticAMD)
cpuid(1) returns 00020ff0 00000800 00000001 078bfbff
Result is 496498219533200
So, does it support SSE3? :?: :?: 8O

Thanks for the information :)

Post by loftwyr » Thu Aug 31, 2006 5:46 pm

If you didn't have SSE3, it would have given an error instead of a result.

Your processor has SSE3
My emerge --info
Have you run revdep-rebuild lately? It's in gentoolkit and it's worth a shot if things don't work well.
Celebrating 5 years of Gentoo-ing.

Post by pacho2 » Thu Aug 31, 2006 6:26 pm

Thanks :)

Post by clytle374 » Fri Sep 01, 2006 5:38 am

I have decided that some here work for MS: either trying to make Linux as slow as Windows, or as unstable as Windows. :P

Now I will have to break it to find out who. :lol: