From the gcc man page:
`-mfpmath=UNIT'
Generate floating point arithmetics for selected unit UNIT. The
choices for UNIT are:
`387'
Use the standard 387 floating point coprocessor present on the
majority of chips and emulated otherwise. Code compiled with
this option will run almost everywhere. The temporary
results are computed in 80bit precision instead of precision
specified by the type resulting in slightly different results
compared to most of other chips. See `-ffloat-store' for more
detailed description.
This is the default choice for i386 compiler.
`sse'
Use scalar floating point instructions present in the SSE
instruction set. This instruction set is supported by
Pentium3 and newer chips, in the AMD line by Athlon-4,
Athlon-xp and Athlon-mp chips. The earlier version of SSE
instruction set supports only single precision arithmetics,
thus the double and extended precision arithmetics is still
done using 387. A later version, present only in Pentium4 and
the future AMD x86-64 chips, supports double precision
arithmetics too.
For the i386 compiler you need to use `-march=CPU-TYPE', `-msse' or
`-msse2' switches to enable SSE extensions and make this
option effective. For x86-64 compiler, these extensions are
enabled by default.
[b]The resulting code should be considerably faster in the
majority of cases[/b] and avoid the numerical instability
problems of 387 code, but may break some existing code that
expects temporaries to be 80bit.
This is the default choice for the x86-64 compiler.
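For what it's worth, here is a rough sketch of how one might pick the switches from what the CPU reports. It assumes Linux, where /proc/cpuinfo has a `flags` line listing `sse` and `sse2`; `suggest_fpmath` is a made-up helper name, not anything that ships with gcc.

```shell
# Hypothetical helper: print floating point switches for a CPU flags string.
# The argument is expected to look like the "flags" line of /proc/cpuinfo.
suggest_fpmath() {
    # Pad with spaces so that whole words are matched ("sse" must not match "ssse3").
    flags=" $1 "
    case "$flags" in
        *" sse2 "*) echo "-msse2 -mfpmath=sse" ;;  # SSE2: single and double precision
        *" sse "*)  echo "-msse -mfpmath=sse" ;;   # SSE only: doubles still go through the 387
        *)          echo "-mfpmath=387" ;;         # no SSE at all: stay with the 387 unit
    esac
}
```

On a Linux box it could be called as `suggest_fpmath "$(grep -m1 '^flags' /proc/cpuinfo)"`. The output is only a starting point; `-march=CPU-TYPE' already enables the matching extensions by itself.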
Pentium 3, Pentium 4 and Athlon-4/XP/MP users could use `-mfpmath=sse` :p For example, on a Pentium 4:
CFLAGS="-march=pentium4 -mtune=pentium4 -O3 -pipe -msse2 -msse -mfpmath=sse -mmmx"
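To double-check that the switches really took effect, one option is to look at the macros gcc predefines. This is only a sketch: it assumes a GCC recent enough to define `__SSE_MATH__` and `__SSE2_MATH__` when SSE math is active, and `report_fpmath` is a hypothetical helper name.

```shell
# Hypothetical helper: read a gcc predefined-macro dump on stdin and
# report which unit the compiler will use for floating point math.
report_fpmath() {
    macros=$(cat)
    case "$macros" in
        *"#define __SSE2_MATH__ 1"*) echo "sse (single and double precision)" ;;
        *"#define __SSE_MATH__ 1"*)  echo "sse (single precision only)" ;;
        *)                           echo "387" ;;  # assumption: no SSE math macro means 387
    esac
}
```

The macro dump comes from `-dM -E`, for example `gcc -march=pentium4 -msse2 -mfpmath=sse -dM -E -x c /dev/null | report_fpmath`. On a 64-bit toolchain `-m32` is probably needed as well, since pentium4 is a 32-bit CPU type.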






