gcc optimize for p3, p4 & xp

Unofficial documentation for various parts of Gentoo Linux. Note: This is not a support forum.

Post by krinn » Sun Nov 21, 2004 7:12 am

Just found this, looks cool!

From the gcc man page:

Code:

`-mfpmath=UNIT'
     Generate floating point arithmetics for selected unit UNIT.  The
     choices for UNIT are:

    `387'
          Use the standard 387 floating point coprocessor present
          majority of chips and emulated otherwise.  Code compiled with
          this option will run almost everywhere.  The temporary
          results are computed in 80bit precision instead of precision
          specified by the type resulting in slightly different results
          compared to most of other chips. See `-ffloat-store' for more
          detailed description.

          This is the default choice for i386 compiler.

    `sse'
          Use scalar floating point instructions present in the SSE
          instruction set.  This instruction set is supported by
          Pentium3 and newer chips, in the AMD line by Athlon-4,
          Athlon-xp and Athlon-mp chips.  The earlier version of SSE
          instruction set supports only single precision arithmetics,
          thus the double and extended precision arithmetics is still
          done using 387.  Later version, present only in Pentium4 and
          the future AMD x86-64 chips supports double precision
          arithmetics too.

          For i387 you need to use `-march=CPU-TYPE', `-msse' or
          `-msse2' switches to enable SSE extensions and make this
          option effective.  For x86-64 compiler, these extensions are
          enabled by default.

          The resulting code should be considerably faster in the
          majority of cases and avoid the numerical instability
          problems of 387 code, but may break some existing code that
          expects temporaries to be 80bit.

          This is the default choice for the x86-64 compiler.

Got it?
Pentium 3, Pentium 4, and Athlon XP users could use it :p

Code:

CFLAGS="-march=pentium4 -mtune=pentium4 -O3 -pipe -msse2 -msse -mfpmath=sse -mmmx"

It should be safe as it's the default choice for x86-64...
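
Before switching to -mfpmath=sse, it may be worth confirming that the CPU really advertises SSE (and SSE2, if double precision should be handled there too, as the man page excerpt describes). A minimal sketch for Linux, assuming the usual "flags" line in /proc/cpuinfo. The helper name cpu_has_flag is made up for this example:

```shell
# cpu_has_flag FLAG [FILE]
# Succeeds if FLAG appears as a whole word on the first "flags" line of
# FILE (default: /proc/cpuinfo). Hypothetical helper, not part of any tool.
cpu_has_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qw -- "$1"
}

# Report the two flags that matter for -mfpmath=sse.
for f in sse sse2; do
    if cpu_has_flag "$f"; then
        echo "$f: yes"
    else
        echo "$f: no"
    fi
done
```

The whole-word match is deliberate, so that "sse" is not satisfied by "sse2" or "ssse3".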

Post by krinn » Sun Nov 21, 2004 7:18 am

Well, I wasn't really sure I should post this one (it looks more dangerous than the other), but if you feel crazy enough:

Code:

    sse,387
          Attempt to utilize both instruction sets at once.  This
          effectively double the amount of available registers and on
          chips with separate execution units for 387 and SSE the
          execution resources too.  Use this option with care, as it is
          still experimental, because the GCC register allocator does
          not model separate functional units well resulting in
          instable performance.

Code:

CFLAGS="-march=pentium4 -mtune=pentium4 -O3 -pipe -msse2 -msse -mfpmath=sse,387 -mmmx"

And don't miss this: "Use this option with care, as it is still experimental."

Post by frenkel » Sun Nov 21, 2004 10:57 am

I've been using this -mfpmath=sse,387 flag since I installed this system about a year ago (Athlon XP 2800+) and have never had any problems with it. I use this system every day.

Frank
http://techfield.org

Post by Dolio » Mon Nov 22, 2004 3:48 am

The only flag here that probably does anything is '-mfpmath=sse,387' and that only because it's experimental.

When you set '-march=whatever' it should automatically signal gcc to use '-msse -mmmx' etc. as appropriate to the architecture you specify. The only reason to use those flags is if you want to use -march=i386 and enable everything else manually, or if something weird is going on with your cpu (like you have an Athlon Thunderbird that magically developed sse2 instructions :)).

Otherwise, it's either redundant (since it's already being specified by march) or potentially dangerous (since you could generate code that doesn't execute on your processor).
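
One way to check this on your own compiler is to dump gcc's predefined macros for a given -march and look for the SIMD ones. A rough sketch: the filter name list_simd_macros is made up here, and the pattern assumes the x86 macro names gcc uses (__MMX__, __SSE__, __SSE2__, __SSE3__, __3dNOW__, plus __SSE_MATH__ and __SSE2_MATH__ in versions that report -mfpmath=sse that way).

```shell
# Filter a "gcc -dM -E" macro dump down to the x86 SIMD feature macros.
# Reads "#define NAME VALUE" lines on stdin, prints the matching NAMEs.
list_simd_macros() {
    awk '$1 == "#define" && $2 ~ /^__(MMX|SSE[0-9]?(_MATH)?|3dNOW)__$/ { print $2 }'
}

# Example with a canned dump; on a real system, feed it gcc's output instead:
#   gcc -march=pentium4 -dM -E -x c /dev/null | list_simd_macros
printf '%s\n' '#define __MMX__ 1' '#define __linux__ 1' '#define __SSE2__ 1' |
    list_simd_macros
```

If the list is the same with and without the extra -msse/-msse2/-mmmx flags, those flags are redundant for that -march. On an x86-64 toolchain, -m32 would probably be needed as well for the 32-bit -march values.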

Post by augury » Mon Nov 22, 2004 6:59 am

-mfpmath=sse,387 doesn't do anything worth the effort.

-msse3 on top of -march=prescott will have an effect if you use gcc-3.4.3 (the devs took it out of the defaults for that -march). I don't know exactly why; I think it was either too much to enable by default or just broken.

Post by frenkel » Mon Nov 22, 2004 3:54 pm

augury wrote: -mfpmath=sse,387 doesn't do anything worth the effort
What is this based on?

Frank
http://techfield.org

Post by rhill » Wed Dec 01, 2004 1:27 am

http://www.coyotegulch.com/products/aco ... ginal.html
http://www.coyotegulch.com/products/aco ... vea_4.html

i was also just browsing the gcc mailing list for reference to sse,387 sucking, and instead found an example to the contrary. in fact, for the P4, 'sse,387' > '387' > 'sse'. not right now (they were discussing a recent patch for gcc 4.0), but it's good to see that it's being looked at. :)

but i've heard a lot about how sse,387 doesn't work, is broken, or runs slower than the defaults. who knows, if it works for you, go for it. as with everything, it depends what you're running and what you're running it on.

Post by MighMoS » Wed Dec 01, 2004 2:14 am

This cut GNOME's startup time in half, as well as maploads for UT2k4 (I relinked the libs)

Post by opm8 » Wed Dec 01, 2004 7:13 am

MighMoS,

What's the command to relink libs?
MighMoS wrote:This cut GNOME's startup time in half, as well as maploads for UT2k4 (I relinked the libs)

Post by ARC2300 » Sun Dec 05, 2004 10:53 am

opm8 wrote:MighMoS,

What's the command to relink libs?
MighMoS wrote:This cut GNOME's startup time in half, as well as maploads for UT2k4 (I relinked the libs)
I believe you're looking for "ldconfig".

Post by yngwin » Mon Dec 06, 2004 10:01 am

Actually on athlon-xp -mfpmath=387 is faster than the other options...

Post by thechris » Mon Dec 06, 2004 6:13 pm

In every test I've done and every one I've seen, -mfpmath=anything is worse than omitting the option. I can only assume the compiler can determine these things better. In the future, 387,sse should be faster.

Post by Genkaku » Mon Dec 06, 2004 6:59 pm

MighMoS, what CPU do you have? And did you choose -mfpmath=387, -mfpmath=sse, or -mfpmath=sse,387?

Post by krinn » Wed Dec 08, 2004 11:27 pm

OK, after a few days of testing -mfpmath=sse,387 I can say:

- Speed: I can't really see a difference, as I haven't tuned it yet. GNOME maybe responds faster, but that could be psychological... and loading does seem noticeably better.
- Stability: no problems with the binaries so far, no crashes... the code is stable for me.

augury: I'm aware of the flags for prescott, nocona, sse3... but I opened this thread for -mfpmath, which I didn't know about. Everyone talks about the others, but I had never seen a thread about this one.
Maybe a lot of people know it, but since nobody wrote it down, I hadn't come across it until now...

dirtyepic: both links are dead, could you post some others?

dolio: yep, but 1/ redundant isn't dangerous (my gcc likes it), and 2/ mtune will set them automatically, not march.
i.e.: -march=pentium4 -msse3 == -march=pentium4 -mtune=prescott
So if you only set march=pentium4 and have a prescott, you won't get sse3 code until mtune or msse3 is specified... As you see, march gives general architecture optimization, but you need to tune for your processor implementation.

Does anyone have a real test case with "time"?
PS: it should be a program that pushes gcc to produce code for sse,387... some equations maybe. And the result should fail a "diff nonoptimizedversion optimizedversion".
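
For what it's worth, a comparison like that could be scripted roughly as below. This is only a sketch: bench.c stands for a hypothetical floating-point-heavy test program that prints its results, the function name fpmath_cmds is made up, and -O3 is simply carried over from the first post. The function only prints the command lines, so they can be reviewed before anything is run.

```shell
# Print one "build, then time" command line per -mfpmath unit.
# $1 = C source file in the current directory (hypothetical, e.g. bench.c)
# $2 = -march value (defaults to pentium4)
fpmath_cmds() {
    src=$1
    march=${2:-pentium4}
    for unit in 387 sse sse,387; do
        # bench.c + "sse,387" -> bench-sse_387
        bin="${src%.c}-$(printf '%s' "$unit" | tr ',' '_')"
        printf 'gcc -O3 -march=%s -mfpmath=%s -o %s %s -lm && time ./%s > %s.out\n' \
            "$march" "$unit" "$bin" "$src" "$bin" "$bin"
    done
}

# Show the three command lines for review.
fpmath_cmds bench.c pentium4
```

Piping the output to bash would run it (time is a shell keyword there); afterwards something like "diff bench-387.out bench-sse.out" shows whether the numerical results differ between units.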

Post by bi3l » Wed Dec 08, 2004 11:49 pm

krinn wrote: dolio: yep, but 1/ redundant isn't dangerous (my gcc likes it), and 2/ mtune will set them automatically, not march.
i.e.: -march=pentium4 -msse3 == -march=pentium4 -mtune=prescott
So if you only set march=pentium4 and have a prescott, you won't get sse3 code until mtune or msse3 is specified... As you see, march gives general architecture optimization, but you need to tune for your processor implementation.
That's not exactly true as you can just set -march=prescott and according to the man page of gcc:
specifying -march=cpu-type implies -mtune=cpu-type.
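
Worth adding, going by the upstream gcc documentation: -mtune only changes how code is scheduled and tuned, it never changes which instructions may be used. So -march=pentium4 -mtune=prescott should not produce SSE3 code, while -march=prescott is documented as enabling SSE3 and implies -mtune=prescott (augury's post above suggests some gcc-3.4.3 builds may have behaved differently). A minimal make.conf sketch for a Prescott, with -O3 -pipe simply carried over from the first post rather than recommended:

```shell
# /etc/make.conf fragment (sketch): -march selects the instruction set
# and implies the matching -mtune, so separate -msse/-msse2/-mmmx/-mtune
# flags should not be needed.
CFLAGS="-march=prescott -O3 -pipe"
CXXFLAGS="${CFLAGS}"
```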

Post by krinn » Thu Dec 09, 2004 2:07 am

good catch :D