| Author |
Message |
scmasaru n00b


Joined: 03 Dec 2002 Posts: 36
|
Posted: Sun Jun 01, 2003 8:46 pm Post subject: -mfpmath=sse,387 is dangerous |
|
|
I used -mfpmath=sse,387 to compile LAME, an mp3 encoder, and it turned out that it not only slowed the encoder down but also produced a different mp3 (checked with cmp).
I would suggest avoiding this option like the plague.
icc has a similar problem. |
|
|
 |
neuron Advocate


Joined: 28 May 2002 Posts: 2371
|
Posted: Sun Jun 01, 2003 8:51 pm Post subject: |
|
|
| Hmmm... interesting. /me has always had it on. |
|
|
 |
bsolar Bodhisattva


Joined: 12 Jan 2003 Posts: 2764
|
Posted: Sun Jun 01, 2003 9:04 pm Post subject: |
|
|
What are the complete CFLAGS, and what's the CPU?
Also, can you clarify what you mean by "different", if possible? _________________ I may not agree with what you say, but I'll defend to the death your right to say it. |
|
|
 |
Lovechild Advocate


Joined: 17 May 2002 Posts: 2858 Location: Århus, Denmark
|
Posted: Sun Jun 01, 2003 9:33 pm Post subject: |
|
|
Actually, what did you expect? You are telling the compiler to generate code for another floating-point unit, which, might I add, is less effective than the default 387 one.
Of course it will do some of the math differently.
If you have an Athlon CPU, do not use that flag; it won't speed anything up, it will slow things down. |
|
|
 |
scmasaru n00b


Joined: 03 Dec 2002 Posts: 36
|
Posted: Sun Jun 01, 2003 11:39 pm Post subject: |
|
|
| bsolar wrote: | What are the complete CFLAGS, and what's the CPU?
Also, can you clarify what you mean by "different", if possible? |
Here are the flags I tried.
1. -march=pentium3 -fprefetch-loop-arrays -fmerge-all-constants -fomit-frame-pointer -O3 -pipe
2. -march=pentium3 -O3 -pipe
3. -march=pentium3 -mfpmath=sse,387 -fomit-frame-pointer -O3 -pipe
4. -march=pentium3 -Os -pipe
The ones with -mfpmath=sse,387 made LAME behave differently from the others.
I also found that #2 was actually slightly faster than #1.
Note that the LAME ebuild strips out -fomit-frame-pointer.
LAME compiled with icc also behaved differently from all of the above.
The command line for lame:
| Code: | time lame --mp3input --alt-preset standard ep11.mp3 ep11-gcc-Os.mp3
time lame --mp3input --alt-preset standard ep11.mp3 ep11-gcc-O3-1.mp3
time lame --mp3input --alt-preset standard ep11.mp3 ep11-gcc-O3-2.mp3
time lame --mp3input --alt-preset standard ep11.mp3 ep11-icc.mp3 |
Each command transcodes a 20-minute CBR mp3 file into a VBR one.
I then used "cmp -l" to compare the resulting mp3 files. |
|
|
 |
Megaptera Tux's lil' helper

Joined: 29 Jul 2002 Posts: 145 Location: Somerville, MA
|
Posted: Mon Jun 02, 2003 1:25 am Post subject: |
|
|
I've read that -mfpmath=sse,387 doesn't get along with Pentium 3s and 4s. I saw it in this thread, which you may find interesting. _________________ It is not like the world will end if I take the day off from eating worlds. |
|
|
 |
Exner Tux's lil' helper


Joined: 08 Apr 2003 Posts: 128 Location: Melbourne, Australia
|
Posted: Mon Jun 02, 2003 7:26 am Post subject: Re: -mfpmath=sse,387 is dangerous |
|
|
| scmasaru wrote: | I used -mfpmath=sse,387 to compile LAME, an mp3 encoder, and it turned out that it not only slowed the encoder down but also produced a different mp3 (checked with cmp).
I would suggest avoiding this option like the plague.
icc has a similar problem. |
The differences may only be slight, and not normally noticeable. It does not mean the file produced is any less valid. In the end it comes down to whichever you prefer. _________________ - Exner (Antony Suter) |
|
|
 |
zhenlin Veteran

Joined: 09 Nov 2002 Posts: 1361
|
Posted: Mon Jun 02, 2003 7:27 am Post subject: |
|
|
SSE is for parallelized computations. The 387 is for single floating-point operations.
SSE can be convinced to do the job of an FPU, but it's a waste of computing power.
Using sse,387 is a sure way to get into trouble. Such programs may try to access a 387 register when the data they are looking for is in an SSE register...
Are they audibly different? If not, does it matter? It is probably a small computational difference, like 0.00000000000000001 and 0.00000000000000002. |
|
|
 |
Exner Tux's lil' helper


Joined: 08 Apr 2003 Posts: 128 Location: Melbourne, Australia
|
Posted: Mon Jun 02, 2003 7:31 am Post subject: |
|
|
| zhenlin wrote: | SSE is for parallelized computations. The 387 is for single floating-point operations.
SSE can be convinced to do the job of an FPU, but it's a waste of computing power.
Using sse,387 is a sure way to get into trouble. Such programs may try to access a 387 register when the data they are looking for is in an SSE register... |
It isn't a waste. It's a case of whichever works best for you.
It is not a "sure way to get into trouble". The compiler handles all the details. If you are putting hand-coded assembly into your application, then you also know what you are doing and when to use which flags. _________________ - Exner (Antony Suter) |
|
|
 |
|