| Author |
Message |
robnotts Guru


Joined: 15 Mar 2004 Posts: 349 Location: Nottingham, UK
|
Posted: Sun Jul 03, 2005 11:02 am Post subject: Question - Who Is Using -ffast-math? |
|
|
After being slapped down by one of the developers for reporting a bug where -ffast-math breaks a program (specifically games-sports/foobillard-3.0a) and requesting a new ebuild with the -ffast-math flag filtered, I am curious to know how many people on AMD64 are using -ffast-math. Are you experiencing any problems, and do you notice any speed-ups?
The reason I ask: I do a lot of video/sound work on this machine, and as most (now thankfully not all) software does not have AMD64 SSE/SSE2 optimisations, I was using -ffast-math to gain some speed. Apart from foobillard as listed above, I have not noticed any problems, and indeed seem to have a very stable machine.
Rob. _________________ ---
Gentoo Athlon64 X2-4200+ on NForce 4 + Geforce 7600GT 2GB/750GB (Desktop)
+ MythTV (3xFreeview,1xFreesat HD) on 1080p
+ Soundblaster Audigy
Gentoo Turion64 X2 Geforce 6150 2GB/120GB (Laptop) |
|
|
 |
neuron Veteran


Joined: 28 May 2002 Posts: 2185
|
|
|
 |
robnotts Guru


Joined: 15 Mar 2004 Posts: 349 Location: Nottingham, UK
|
Posted: Sun Jul 03, 2005 12:44 pm Post subject: |
|
|
An interesting article certainly... I think I'll stick with -ffast-math, and just put into my /usr/local/portage directory any ebuild that I need to change to filter out flags.
Rob. |
|
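For reference, inside an actual ebuild this kind of filtering is normally done with `filter-flags -ffast-math` after `inherit flag-o-matic`. The sketch below is only a plain-shell imitation of the effect, so it can be run outside Portage; the helper name `strip_flag` and the sample CFLAGS value are made up for illustration and are not taken from the eclass.

```shell
#!/bin/sh
# Hypothetical helper: remove every exact occurrence of one flag from a
# space-separated flag string. This imitates the effect of flag-o-matic's
# filter-flags; it is not the eclass implementation.
strip_flag() {
    unwanted=$1
    shift
    result=""
    # $* is left unquoted on purpose so the flag string splits on spaces.
    for flag in $*; do
        if [ "$flag" != "$unwanted" ]; then
            result="${result:+$result }$flag"
        fi
    done
    printf '%s\n' "$result"
}

# Sample value for illustration, not copied from any real make.conf.
CFLAGS="-march=athlon64 -O2 -pipe -ffast-math"
CFLAGS=$(strip_flag -ffast-math "$CFLAGS")
echo "CFLAGS=\"$CFLAGS\""
```

In a copy of the ebuild under /usr/local/portage, the equivalent would be a single `filter-flags` call placed before the build starts.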
|
 |
crazycat l33t


Joined: 26 Aug 2003 Posts: 831 Location: Hamburg, Germany
|
Posted: Sun Jul 03, 2005 1:16 pm Post subject: |
|
|
A very interesting article on the topic: Gentoo on Opteron with gcc 3.4.3
http://www.coyotegulch.com/products/acovea/aco5k8gcc34.html
You can see that -O2 is generally better than -O3, probably because -O3 includes -funroll-all-loops.
-funroll-all-loops looks like the worst optimization ever, while -frename-registers and -fweb are quite useful.
I also found this on -ffast-math while googling:
| Quote: |
The gcc `man` page says that -ffast-math allows for ANSI and
IEEE rules to be violated. There is also a statement about
not using it in conjunction with -O options as this will result
in incorrect output.
So you get what you asked for.
The FPU has 80-bits of precision internally. Its state is
always saved and restored across context-switches. There
are no "extra bits of precision" as you state.
The FPU's state is not saved during system-calls so the
kernel is not supposed to use the FPU internally.
Look at <math.h> and the files it includes. Note that the
math library takes and returns type double. If you have
declared your floating-point variables as type float, you
will have serious dynamic rounding errors unless you
closely adhere to the IEEE spec. Even then, it might
be serious. If the IEEE spec gets violated by the
-ffast-math flag, you might have discovered the reason why
you get strange values.
|
_________________ core2quad, mainboard: dfi lanparty ut icfx 3200-T2R/G, 2gb ram, nvidia-7300gs, samsung 1tb, 512gb and an old maxtor diamond max 200gb, sound: sb audigy. |
|
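One way to see the kind of IEEE rule the quote is talking about is a NaN self-comparison test. The script below is a rough sketch, assuming gcc is on the PATH; the file names and the tiny C program are invented for illustration. Under IEEE rules only NaN compares unequal to itself, while -ffast-math allows the compiler to assume NaN never occurs. What the second build prints depends on the compiler version and target, so no particular result is claimed for it.

```shell
#!/bin/sh
# Sketch: build the same NaN self-test with and without -ffast-math.
# Assumes gcc is installed; skips quietly otherwise.
command -v gcc >/dev/null 2>&1 || { echo "gcc not found, skipping"; exit 0; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/nan_check.c" <<'EOF'
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    /* Take the value at run time so the division is not folded away. */
    double zero = (argc > 1) ? atof(argv[1]) : 0.0;
    double x = zero / zero;            /* NaN under IEEE 754 rules */

    /* Under IEEE rules only NaN compares unequal to itself. */
    puts((x != x) ? "NaN detected" : "NaN missed");
    return 0;
}
EOF

gcc -O2 -o "$workdir/strict" "$workdir/nan_check.c"
gcc -O2 -ffast-math -o "$workdir/fast" "$workdir/nan_check.c"

strict_out=$("$workdir/strict")
fast_out=$("$workdir/fast")
echo "without -ffast-math: $strict_out"
echo "with    -ffast-math: $fast_out"
```

If the two lines differ on your machine, that is the sort of breakage a package relying on NaN checks would hit.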
|
 |
neuron Veteran


Joined: 28 May 2002 Posts: 2185
|
Posted: Sun Jul 03, 2005 1:45 pm Post subject: |
|
|
The threads that benchmark linked to were very interesting. I especially liked this post:
http://gcc.gnu.org/ml/gcc/2004-03/msg01459.html
If it's good enough for people who "make a very nice
living from selling software to solve finite-difference Poisson-Boltzmann
electrostatic calculations on regular grids, and molecular minimizations
using quasi-Newtonian numerical optimizers" then hell, it's gotta be good enough for me.
In fact I'm gonna recompile with
| Code: |
CFLAGS="-march=athlon64 -O2 -msse3 -pipe -frename-registers -fweb -ffast-math"
CXXFLAGS="${CFLAGS} -fvisibility-inlines-hidden"
|
I've now added -frename-registers, -fweb and -ffast-math. |
|
|
 |
Earthwings Administrator


Joined: 14 Apr 2003 Posts: 7347 Location: Karlsruhe, Germany
|
Posted: Sun Jul 03, 2005 2:17 pm Post subject: |
|
|
Please don't create any bug reports if you're using it. _________________ KDE 4.1 - Get It While It's Hot! |
|
|
 |
crazycat l33t


Joined: 26 Aug 2003 Posts: 831 Location: Hamburg, Germany
|
Posted: Sun Jul 03, 2005 3:06 pm Post subject: |
|
|
I once posted a bug to Gentoo's Bugzilla along with a link to the patch that fixes the problem (http://bugs.gentoo.org/show_bug.cgi?id=87149), and it's still not fixed (3 months already). Since then I'd rather check/post on the creator's website than report a bug to Gentoo's Bugzilla. _________________ core2quad, mainboard: dfi lanparty ut icfx 3200-T2R/G, 2gb ram, nvidia-7300gs, samsung 1tb, 512gb and an old maxtor diamond max 200gb, sound: sb audigy. |
|
|
 |
Birtz Apprentice


Joined: 09 Feb 2005 Posts: 271 Location: Osijek / Croatia
|
Posted: Sun Jul 03, 2005 3:46 pm Post subject: |
|
|
Interesting read, neuron, thank you. I don't use the -ffast-math flag myself because I never did much FP-intensive stuff. If I did, I would consider testing the overall speed gain against the risk of wrong results and draw my own conclusion. That said, I follow the directions set by the Gentoo developers (and the "vast majority" of GCC developers). This way I don't step on toes, and "in most cases" my bug reports or objections are taken seriously. On the other hand, if the bug-report policy says that you should drop "those flags" and "test again", it is only a few minutes' more effort, no?
Regards _________________ It is not enough to have a good mind. The main thing is to use it well.
-- Rene Descartes
Don't have a childhood hero? How about Rob Hubbard http://www.freenetpages.co.uk/hp/tcworh/profile.htm |
|
|
 |
thumper Guru


Joined: 05 Dec 2002 Posts: 303 Location: Venice FL
|
Posted: Sun Jul 03, 2005 3:58 pm Post subject: |
|
|
With the exception of a few packages that this breaks, I have been using this system wide without any obvious ill effects. (knock wood)
| Code: |
CFLAGS="-march=athlon64 -O2 -pipe -frename-registers -fweb -ffast-math -mfpmath=sse -ftracer -funroll-loops -fstack-protector"
|
George |
|
|
 |
robnotts Guru


Joined: 15 Mar 2004 Posts: 349 Location: Nottingham, UK
|
Posted: Sun Jul 03, 2005 4:10 pm Post subject: |
|
|
Well, I must admit I have no problem with taking out the -ffast-math flag for those packages that need it removed (just one so far, foobillard), but surely it only takes a few minutes' effort to add the command to the ebuild, unless I misunderstood something?
Anyway, never mind, I'll just carry on regardless... I seem to have very conservative CFLAGS anyway, with the exception of the -ffast-math flag...
I'll get around to doing an emerge -Dvep world with something a bit more serious (I like the look of -fweb and -frename-registers) once I get the X2 processor and don't have to kill a day and a bit recompiling!
Rob. |
|
|
 |
Shapemaker n00b


Joined: 22 Aug 2004 Posts: 64 Location: Finland
|
Posted: Sun Jul 03, 2005 11:41 pm Post subject: |
|
|
| crazycat wrote: | A very interesting article on the topic: Gentoo on Opteron with gcc 3.4.3
http://www.coyotegulch.com/products/acovea/aco5k8gcc34.html
You can see that -O2 is generally better than -O3, probably because -O3 includes -funroll-all-loops.
-funroll-all-loops looks like the worst optimization ever, while -frename-registers and -fweb are quite useful.
|
No, -O3 does not include -funroll-loops. From http://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/Optimize-Options.html:
| Quote: | -O3
Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -fweb and -frename-registers options. |
You're likely thinking about -finline-functions, which can be very useful at times, specifically with all kinds of media encoders. _________________ "Intellectual Property" should be an affront to anyone capable of independent thought. |
|
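If you would rather ask the compiler than the manual, newer GCC releases can list which optimizer flags each -O level switches on. This is a rough sketch that assumes a GCC recent enough to accept `-Q --help=optimizers`, which the 3.4 series discussed in this thread does not; the temporary file handling is just one way to do it, and the output varies by version.

```shell
#!/bin/sh
# Sketch: compare the optimizer flags reported for -O2 and -O3.
# Skips quietly if gcc is missing or does not support the query.
command -v gcc >/dev/null 2>&1 || { echo "gcc not found, skipping"; exit 0; }
gcc -Q --help=optimizers -O2 >/dev/null 2>&1 || {
    echo "this gcc does not support -Q --help=optimizers, skipping"
    exit 0
}

o2=$(mktemp)
o3=$(mktemp)
trap 'rm -f "$o2" "$o3"' EXIT

gcc -Q --help=optimizers -O2 > "$o2"
gcc -Q --help=optimizers -O3 > "$o3"

echo "Lines that differ between -O2 (<) and -O3 (>):"
diff "$o2" "$o3" | grep '^[<>]'

echo "Loop unrolling flags as reported for -O3:"
grep -e '-funroll' "$o3"
```

The last section is the quickest check of whether any of the unrolling flags are part of -O3 on your particular compiler.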
|
 |
Imago Apprentice

Joined: 25 Nov 2004 Posts: 157 Location: Germany
|
Posted: Mon Jul 04, 2005 4:48 am Post subject: |
|
|
| thumper wrote: | With the exception of a few packages that this breaks, I have been using this system wide without any obvious ill effects. (knock wood)
| Code: |
CFLAGS="-march=athlon64 -O2 -pipe -frename-registers -fweb -ffast-math -mfpmath=sse -ftracer -funroll-loops -fstack-protector"
|
George |
-mfpmath=sse is the default choice for the x86-64 compiler, so you can leave that out
CU
Imago |
|
|
 |
|