Gentoo Forums
Making full use of cpu registers in CFLAGS
Gnufsh
Guru
Joined: 28 Dec 2002
Posts: 400
Location: Portland, OR
Posted: Tue Apr 01, 2003 8:57 am

I know that with -march=athlon-xp, SSE, 3DNow!, and MMX are turned on. Adding -msse, -m3dnow, and -mmmx on top also makes -mno-sse, -mno-mmx, and -mno-3dnow show up under "options enabled". Note the gcc -march=athlon-xp -Q -v output for a test file:
Quote:

options passed: -lang-c -v -D__GNUC__=3 -D__GNUC_MINOR__=2
-D__GNUC_PATCHLEVEL__=2 -D__GXX_ABI_VERSION=102 -D__ELF__ -Dunix
-D__gnu_linux__ -Dlinux -D__ELF__ -D__unix__ -D__gnu_linux__ -D__linux__
-D__unix -D__linux -Asystem=posix -D__NO_INLINE__ -D__STDC_HOSTED__=1
-Acpu=i386 -Amachine=i386 -Di386 -D__i386 -D__i386__ -D__athlon
-D__athlon__ -D__athlon_sse__ -D__tune_athlon__ -D__tune_athlon_sse__
-D__SSE__ -D__MMX__ -D__3dNOW__ -D__3dNOW_A__ -march=athlon-xp
options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
-fpcc-struct-return -fgcse-lm -fgcse-sm -fsched-interblock -fsched-spec
-fbranch-count-reg -fcommon -fgnu-linker -fargument-alias -fident
-fmath-errno -ftrapping-math -m80387 -mhard-float -mno-soft-float
-mieee-fp -mfp-ret-in-387 -mcpu=athlon-xp -march=athlon-xp

See the -D__SSE__ -D__MMX__ -D__3dNOW__ -D__3dNOW_A__? These are the macros that signal SSE and such are turned on. Now without the -march:
Quote:

options passed: -lang-c -v -D__GNUC__=3 -D__GNUC_MINOR__=2
-D__GNUC_PATCHLEVEL__=2 -D__GXX_ABI_VERSION=102 -D__ELF__ -Dunix
-D__gnu_linux__ -Dlinux -D__ELF__ -D__unix__ -D__gnu_linux__ -D__linux__
-D__unix -D__linux -Asystem=posix -D__NO_INLINE__ -D__STDC_HOSTED__=1
-Acpu=i386 -Amachine=i386 -Di386 -D__i386 -D__i386__ -D__tune_i686__
-D__tune_pentiumpro__
options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
-fpcc-struct-return -fgcse-lm -fgcse-sm -fsched-interblock -fsched-spec
-fbranch-count-reg -fcommon -fgnu-linker -fargument-alias -fident
-fmath-errno -ftrapping-math -m80387 -mhard-float -mno-soft-float
-mieee-fp -mfp-ret-in-387 -mcpu=pentiumpro -march=i386

and gcc -Q -v -mmmx -msse -m3dnow:
Quote:

options passed: -lang-c -v -D__GNUC__=3 -D__GNUC_MINOR__=2
-D__GNUC_PATCHLEVEL__=2 -D__GXX_ABI_VERSION=102 -D__ELF__ -Dunix
-D__gnu_linux__ -Dlinux -D__ELF__ -D__unix__ -D__gnu_linux__ -D__linux__
-D__unix -D__linux -Asystem=posix -D__NO_INLINE__ -D__STDC_HOSTED__=1
-Acpu=i386 -Amachine=i386 -Di386 -D__i386 -D__i386__ -D__SSE__ -D__MMX__
-D__3dNOW__ -D__tune_i686__ -D__tune_pentiumpro__ -mmmx -msse -m3dnow
options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
-fpcc-struct-return -fgcse-lm -fgcse-sm -fsched-interblock -fsched-spec
-fbranch-count-reg -fcommon -fgnu-linker -fargument-alias -fident
-fmath-errno -ftrapping-math -m80387 -mhard-float -mno-soft-float
-mieee-fp -mfp-ret-in-387 -mmmx -mno-mmx -m3dnow -mno-3dnow -msse -mno-sse
-mcpu=pentiumpro -march=i386

See how the -D macros change?
One more with -march plus -msse, etc.:
Quote:

options passed: -lang-c -v -D__GNUC__=3 -D__GNUC_MINOR__=2
-D__GNUC_PATCHLEVEL__=2 -D__GXX_ABI_VERSION=102 -D__ELF__ -Dunix
-D__gnu_linux__ -Dlinux -D__ELF__ -D__unix__ -D__gnu_linux__ -D__linux__
-D__unix -D__linux -Asystem=posix -D__NO_INLINE__ -D__STDC_HOSTED__=1
-Acpu=i386 -Amachine=i386 -Di386 -D__i386 -D__i386__ -D__athlon
-D__athlon__ -D__athlon_sse__ -D__tune_athlon__ -D__tune_athlon_sse__
-D__SSE__ -D__MMX__ -D__3dNOW__ -D__3dNOW_A__ -march=athlon-xp -mmmx -msse
-m3dnow
options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
-fpcc-struct-return -fgcse-lm -fgcse-sm -fsched-interblock -fsched-spec
-fbranch-count-reg -fcommon -fgnu-linker -fargument-alias -fident
-fmath-errno -ftrapping-math -m80387 -mhard-float -mno-soft-float
-mieee-fp -mfp-ret-in-387 -mmmx -mno-mmx -m3dnow -mno-3dnow -msse -mno-sse
-mcpu=athlon-xp -march=athlon-xp

See the "-mmmx -mno-mmx -m3dnow -mno-3dnow -msse -mno-sse" in options enabled? MMX, SSE, 3DNow! and so on are already enabled by -march=athlon-xp. And no Athlon XP supports SSE2. Perhaps the 2400+ supports SSE too (as in also...). I'm also not yet sure of the performance gains from using -mfpmath=sse,387. Anybody know of a good way to benchmark this?
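
The macro comparison between the dumps quoted above can be done mechanically. Here is a minimal sketch, assuming the "options passed" text from gcc -Q -v has been copied into a string. The helper name simd_macros is made up for illustration and is not part of gcc.

```python
import re

# Feature macros that appear in the gcc -Q -v dumps quoted above.
SIMD_MACROS = ("__MMX__", "__SSE__", "__3dNOW__", "__3dNOW_A__")

def simd_macros(options_passed):
    """Return the SIMD feature macros defined via -D in a gcc option dump."""
    defined = set(re.findall(r"-D(\w+)", options_passed))
    return {macro for macro in SIMD_MACROS if macro in defined}

# Tails of the "options passed" lines from the dumps above.
with_march = "-D__SSE__ -D__MMX__ -D__3dNOW__ -D__3dNOW_A__ -march=athlon-xp"
with_m_flags = ("-D__i386__ -D__SSE__ -D__MMX__ -D__3dNOW__ "
                "-D__tune_i686__ -D__tune_pentiumpro__ -mmmx -msse -m3dnow")
no_flags = "-D__i386__ -D__tune_i686__ -D__tune_pentiumpro__"

for label, dump in (("-march=athlon-xp", with_march),
                    ("-mmmx -msse -m3dnow", with_m_flags),
                    ("no flags", no_flags)):
    print(label, "->", sorted(simd_macros(dump)))
```

In the dumps quoted here, the only SIMD macro that -march=athlon-xp adds over the three -m flags on their own is __3dNOW_A__.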
AlterEgo
Veteran
Joined: 25 Apr 2002
Posts: 1619
Posted: Tue Apr 01, 2003 9:18 am

Gnufsh wrote:
Anybody know of a good way to benchmark this?

I've used freebench to test the effect of CFLAGS: www.freebench.org.
It offers six different benchmarks and a limited online comparison database.
magnet
Guru
Joined: 16 Mar 2003
Posts: 582
Location: france
Posted: Tue Apr 01, 2003 9:19 am

I use the -mfpmath=sse,387 thingy.
Let's recompile the whole system, and I'll post what happens.
Should I benchmark it before and after? With glxgears, maybe?
_________________
every step aim at glory.
magnet
Guru
Joined: 16 Mar 2003
Posts: 582
Location: france
Posted: Tue Apr 01, 2003 9:19 am

lol, this is an answer-before-a-question :)
thx. 8)
barlad
l33t
Joined: 22 Feb 2003
Posts: 673
Posted: Tue Apr 01, 2003 11:22 am

Yeah, let us know how those P4 optimizations work out, please! After having read all those threads about CPU flags I am quite confused, and I am not sure whether I should fully recompile my system or not.

I have been using -march=pentium4 -O3 so far, but I saw there was a lot of other stuff available. Your benchmarks would help in making a decision! ;)
magnet
Guru
Joined: 16 Mar 2003
Posts: 582
Location: france
Posted: Tue Apr 01, 2003 11:33 am

-march=pentium4 is broken, search for it in the forums. :cry:
barlad
l33t
Joined: 22 Feb 2003
Posts: 673
Posted: Tue Apr 01, 2003 11:37 am

Yeah, I heard about it. I have not had the slightest problem with it, though. I have that bug with overflow in python/php, but it does not seem to have had any impact so far.
That's why I am waiting on some tests to see if -march=pentium4 is really different from -march=pentium3 -mcpu=pentium4. If everything works fine, I don't want to recompile everything and end up with reduced performance :)
magnet
Guru
Joined: 16 Mar 2003
Posts: 582
Location: france
Posted: Tue Apr 01, 2003 11:39 am

I'm in exactly the same situation as you.
I've read in some threads that using -march=pentium4 will be slower than using -march=pentium3 -mcpu=pentium4. :roll:
kappax
Apprentice
Joined: 30 Aug 2002
Posts: 273
Location: The Moon
Posted: Tue Apr 01, 2003 3:07 pm

I am still just a little confused why

-msse

would turn on -mno-sse. To me, if I take the time to tell it "-msse", it better damn well be using "-msse".

Anyway,

I compiled my KDE with
-mmmx -msse -m3dnow and it seems faster.
_________________
My Box
glxgears - 4083.400 FPS
OS: GNU/Linux
Distro: Gentoo
kernel: 2.6.0-test9-mm2
----------------------
vi makes me :wq in word pad :(
kappax
Apprentice
Joined: 30 Aug 2002
Posts: 273
Location: The Moon
Posted: Tue Apr 01, 2003 3:09 pm

Ooh, one way to test this is to compile mplayer with the flags. If it does not give its spiel about MMX or SSE when starting, then it did disable MMX and SSE.
kappax
Apprentice
Joined: 30 Aug 2002
Posts: 273
Location: The Moon
Posted: Tue Apr 01, 2003 3:13 pm

This is what I get:


Code:

CPU: Advanced Micro Devices Athlon 4 PM Palomino/Athlon MP Multiprocessor/Athlon XP eXtreme Performance (Family: 6, Stepping: 2)
Detected cache-line size is 64 bytes
SSE supported but disabled
CPUflags:  MMX: 1 MMX2: 1 3DNow: 1 3DNow2: 1 SSE: 0 SSE2: 0
Compiled for x86 CPU with extensions: MMX MMX2 3DNow 3DNowEx


with flags:

Code:

CFLAGS="-O3 -march=athlon-xp -pipe -fomit-frame-pointer -ffast-math -mmmx -msse -m3dnow -O3 -mfpmath=sse,387 "


Hahah, I have -O3 two times! Ahhh!!
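
The mplayer startup check can be scripted instead of read by eye. This is a rough sketch that only assumes the "CPUflags:" line format shown in the output above. The helper name parse_cpuflags is hypothetical and is not an mplayer interface.

```python
import re

def parse_cpuflags(line):
    """Turn an mplayer 'CPUflags:' line into a {name: bool} mapping."""
    _, sep, rest = line.partition("CPUflags:")
    if not sep:
        raise ValueError("not a CPUflags line")
    # Each entry looks like 'SSE: 0' or '3DNow: 1'.
    return {name: value == "1"
            for name, value in re.findall(r"(\w+):\s*([01])", rest)}

# The line from the mplayer output quoted above.
line = "CPUflags:  MMX: 1 MMX2: 1 3DNow: 1 3DNow2: 1 SSE: 0 SSE2: 0"
flags = parse_cpuflags(line)
disabled = [name for name, on in flags.items() if not on]
print("disabled:", disabled)
```

For the output above this lists SSE and SSE2 as disabled, which matches the "SSE supported but disabled" message.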
eradicator
Retired Dev
Joined: 01 Apr 2003
Posts: 144
Location: Berkeley, CA
Posted: Tue Apr 01, 2003 6:59 pm

According to freehackers.org:

Quote:
-mmmx, -msse are implied by -march=pentium3
Gnufsh
Guru
Joined: 28 Dec 2002
Posts: 400
Location: Portland, OR
Posted: Tue Apr 01, 2003 8:42 pm

Here are some results from nbench compiled with different flags:
(At the bottom, I did three runs each of two different cflags settings. Note the disparity between the three runs with the same cflags)


-mmmx -mno-mmx -m3dnow -mno-3dnow -msse -mno-sse
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1584.2 : 40.63 : 13.34
STRING SORT : 106.84 : 47.74 : 7.39
BITFIELD : 3.9963e+08 : 68.55 : 14.32
FP EMULATION : 176.08 : 84.49 : 19.50
FOURIER : 18356 : 20.88 : 11.73
ASSIGNMENT : 26.736 : 101.74 : 26.39
IDEA : 3161.9 : 48.36 : 14.36
HUFFMAN : 1354.6 : 37.56 : 12.00
NEURAL NET : 33.081 : 53.14 : 22.35
LU DECOMPOSITION : 1085 : 56.21 : 40.59
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 57.491
FLOATING-POINT INDEX: 39.653
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.081
INTEGER INDEX : 14.549
FLOATING-POINT INDEX: 21.993
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.


-O2 -mcpu=i686 -pipe
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1112.5 : 28.53 : 9.37
STRING SORT : 115.67 : 51.68 : 8.00
BITFIELD : 2.95e+08 : 50.60 : 10.57
FP EMULATION : 69.649 : 33.42 : 7.71
FOURIER : 18412 : 20.94 : 11.76
ASSIGNMENT : 18.052 : 68.69 : 17.82
IDEA : 2114.1 : 32.33 : 9.60
HUFFMAN : 1126 : 31.23 : 9.97
NEURAL NET : 28.048 : 45.06 : 18.95
LU DECOMPOSITION : 919.2 : 47.62 : 34.39
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 40.310
FLOATING-POINT INDEX: 35.549
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 11.464
INTEGER INDEX : 9.120
FLOATING-POINT INDEX: 19.717
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.





-march=athlon-xp -O2 -pipe
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1557.1 : 39.93 : 13.11
STRING SORT : 114.28 : 51.06 : 7.90
BITFIELD : 3.3042e+08 : 56.68 : 11.84
FP EMULATION : 82.92 : 39.79 : 9.18
FOURIER : 18349 : 20.87 : 11.72
ASSIGNMENT : 21.266 : 80.92 : 20.99
IDEA : 1981.6 : 30.31 : 9.00
HUFFMAN : 1209.8 : 33.55 : 10.71
NEURAL NET : 27.523 : 44.21 : 18.60
LU DECOMPOSITION : 1044.1 : 54.09 : 39.06
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 45.080
FLOATING-POINT INDEX: 36.816
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 12.523
INTEGER INDEX : 10.380
FLOATING-POINT INDEX: 20.419
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.





-march=athlon-xp -O2 -fomit-frame-pointer -pipe
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1593.6 : 40.87 : 13.42
STRING SORT : 114.76 : 51.28 : 7.94
BITFIELD : 3.3532e+08 : 57.52 : 12.01
FP EMULATION : 89.56 : 42.98 : 9.92
FOURIER : 18340 : 20.86 : 11.72
ASSIGNMENT : 20.759 : 78.99 : 20.49
IDEA : 2061.4 : 31.53 : 9.36
HUFFMAN : 1210.8 : 33.58 : 10.72
NEURAL NET : 25.778 : 41.41 : 17.42
LU DECOMPOSITION : 986.56 : 51.11 : 36.91
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 45.960
FLOATING-POINT INDEX: 35.341
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 12.501
INTEGER INDEX : 10.751
FLOATING-POINT INDEX: 19.601
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.





-march=athlon-xp -O3 -fomit-frame-pointer -pipe
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1596.5 : 40.94 : 13.45
STRING SORT : 115.52 : 51.62 : 7.99
BITFIELD : 3.3783e+08 : 57.95 : 12.10
FP EMULATION : 134.03 : 64.31 : 14.84
FOURIER : 18340 : 20.86 : 11.72
ASSIGNMENT : 20.782 : 79.08 : 20.51
IDEA : 3173.8 : 48.54 : 14.41
HUFFMAN : 1196.2 : 33.17 : 10.59
NEURAL NET : 26.129 : 41.97 : 17.66
LU DECOMPOSITION : 1019.4 : 52.81 : 38.13
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 51.816
FLOATING-POINT INDEX: 35.890
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 12.565
INTEGER INDEX : 13.211
FLOATING-POINT INDEX: 19.906
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.





-march=athlon-xp -O3 -fomit-frame-pointer -pipe -fprefetch-loop-arrays
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1581.8 : 40.57 : 13.32
STRING SORT : 114.72 : 51.26 : 7.93
BITFIELD : 3.2762e+08 : 56.20 : 11.74
FP EMULATION : 133.44 : 64.03 : 14.78
FOURIER : 18269 : 20.78 : 11.67
ASSIGNMENT : 20.813 : 79.20 : 20.54
IDEA : 3183.1 : 48.68 : 14.45
HUFFMAN : 1284.1 : 35.61 : 11.37
NEURAL NET : 26.179 : 42.05 : 17.69
LU DECOMPOSITION : 1069.7 : 55.41 : 40.01
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 51.994
FLOATING-POINT INDEX: 36.447
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 12.414
INTEGER INDEX : 13.411
FLOATING-POINT INDEX: 20.215
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.





-march=athlon-xp -O3 -fomit-frame-pointer -pipe -funroll-loops
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1605 : 41.16 : 13.52
STRING SORT : 107.4 : 47.99 : 7.43
BITFIELD : 3.9864e+08 : 68.38 : 14.28
FP EMULATION : 182.36 : 87.50 : 20.19
FOURIER : 18411 : 20.94 : 11.76
ASSIGNMENT : 26.576 : 101.13 : 26.23
IDEA : 3182.9 : 48.68 : 14.45
HUFFMAN : 1366.9 : 37.90 : 12.10
NEURAL NET : 33.054 : 53.10 : 22.34
LU DECOMPOSITION : 1006.6 : 52.15 : 37.66
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 57.990
FLOATING-POINT INDEX: 38.703
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.066
INTEGER INDEX : 14.782
FLOATING-POINT INDEX: 21.466
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.





-march=athlon-xp -O3 -fomit-frame-pointer -pipe -funroll-loops -finline-functions
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1601.9 : 41.08 : 13.49
STRING SORT : 107.2 : 47.90 : 7.41
BITFIELD : 4.0036e+08 : 68.68 : 14.34
FP EMULATION : 182.28 : 87.47 : 20.18
FOURIER : 18364 : 20.88 : 11.73
ASSIGNMENT : 26.587 : 101.17 : 26.24
IDEA : 3184.5 : 48.71 : 14.46
HUFFMAN : 1365.8 : 37.87 : 12.09
NEURAL NET : 33.027 : 53.06 : 22.32
LU DECOMPOSITION : 1091.2 : 56.53 : 40.82
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 57.992
FLOATING-POINT INDEX: 39.713
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.079
INTEGER INDEX : 14.773
FLOATING-POINT INDEX: 22.026
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.





-march=athlon-xp -O3 -fomit-frame-pointer -pipe -funroll-loops -finline-functions -mfpmath=sse
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1607.4 : 41.22 : 13.54
STRING SORT : 107.28 : 47.94 : 7.42
BITFIELD : 3.9954e+08 : 68.54 : 14.32
FP EMULATION : 182.48 : 87.56 : 20.21
FOURIER : 18364 : 20.88 : 11.73
ASSIGNMENT : 26.558 : 101.06 : 26.21
IDEA : 3182.9 : 48.68 : 14.45
HUFFMAN : 1366.3 : 37.89 : 12.10
NEURAL NET : 33.134 : 53.23 : 22.39
LU DECOMPOSITION : 1032.7 : 53.50 : 38.63
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 58.009
FLOATING-POINT INDEX: 39.032
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.068
INTEGER INDEX : 14.789
FLOATING-POINT INDEX: 21.649
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.






-march=athlon-xp -O3 -fomit-frame-pointer -pipe -funroll-loops -finline-functions -mfpmath=sse,387
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1612.6 : 41.36 : 13.58
STRING SORT : 106.92 : 47.77 : 7.39
BITFIELD : 3.9792e+08 : 68.26 : 14.26
FP EMULATION : 182.28 : 87.47 : 20.18
FOURIER : 18380 : 20.90 : 11.74
ASSIGNMENT : 26.547 : 101.02 : 26.20
IDEA : 3184.4 : 48.70 : 14.46
HUFFMAN : 1366.3 : 37.89 : 12.10
NEURAL NET : 33.16 : 53.27 : 22.41
LU DECOMPOSITION : 1045.7 : 54.17 : 39.12
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 57.966
FLOATING-POINT INDEX: 39.217
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.031
INTEGER INDEX : 14.799
FLOATING-POINT INDEX: 21.751
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.




-march=athlon-xp -O3 -fomit-frame-pointer -pipe -funroll-loops -finline-functions -falign-functions
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1601.4 : 41.07 : 13.49
STRING SORT : 107.48 : 48.03 : 7.43
BITFIELD : 3.9724e+08 : 68.14 : 14.23
FP EMULATION : 182.28 : 87.47 : 20.18
FOURIER : 18396 : 20.92 : 11.75
ASSIGNMENT : 26.515 : 100.90 : 26.17
IDEA : 3175.1 : 48.56 : 14.42
HUFFMAN : 1360.4 : 37.72 : 12.05
NEURAL NET : 32.869 : 52.80 : 22.21
LU DECOMPOSITION : 1013.3 : 52.49 : 37.91
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 57.867
FLOATING-POINT INDEX: 38.704
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.042
INTEGER INDEX : 14.746
FLOATING-POINT INDEX: 21.467
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.






3 runs niced to -19
-march=athlon-xp -O3 -fomit-frame-pointer -pipe -funroll-loops -finline-functions
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1606.8 : 41.21 : 13.53
STRING SORT : 107.24 : 47.92 : 7.42
BITFIELD : 3.9925e+08 : 68.48 : 14.30
FP EMULATION : 182.88 : 87.75 : 20.25
FOURIER : 18404 : 20.93 : 11.76
ASSIGNMENT : 26.624 : 101.31 : 26.28
IDEA : 3191.1 : 48.81 : 14.49
HUFFMAN : 1368 : 37.93 : 12.11
NEURAL NET : 33.094 : 53.16 : 22.36
LU DECOMPOSITION : 1024.6 : 53.08 : 38.33
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 58.066
FLOATING-POINT INDEX: 38.943
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.074
INTEGER INDEX : 14.810
FLOATING-POINT INDEX: 21.599
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1606.4 : 41.20 : 13.53
STRING SORT : 107.8 : 48.17 : 7.46
BITFIELD : 3.983e+08 : 68.32 : 14.27
FP EMULATION : 183.04 : 87.83 : 20.27
FOURIER : 18411 : 20.94 : 11.76
ASSIGNMENT : 26.608 : 101.25 : 26.26
IDEA : 3191.1 : 48.81 : 14.49
HUFFMAN : 1368.5 : 37.95 : 12.12
NEURAL NET : 33.107 : 53.18 : 22.37
LU DECOMPOSITION : 1028.2 : 53.26 : 38.46
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 58.093
FLOATING-POINT INDEX: 38.998
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.085
INTEGER INDEX : 14.813
FLOATING-POINT INDEX: 21.630
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1597.6 : 40.97 : 13.46
STRING SORT : 107.4 : 47.99 : 7.43
BITFIELD : 3.9998e+08 : 68.61 : 14.33
FP EMULATION : 182.92 : 87.77 : 20.25
FOURIER : 18405 : 20.93 : 11.76
ASSIGNMENT : 26.603 : 101.23 : 26.26
IDEA : 3189.8 : 48.79 : 14.49
HUFFMAN : 1367.4 : 37.92 : 12.11
NEURAL NET : 33.107 : 53.18 : 22.37
LU DECOMPOSITION : 1021.6 : 52.92 : 38.22
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 58.035
FLOATING-POINT INDEX: 38.910
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.086
INTEGER INDEX : 14.786
FLOATING-POINT INDEX: 21.581
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.





3 runs niced to -19
-march=athlon-xp -O3 -fomit-frame-pointer -pipe -funroll-loops -finline-functions -mfpmath=sse,387

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1604.6 : 41.15 : 13.52
STRING SORT : 108.71 : 48.58 : 7.52
BITFIELD : 4.009e+08 : 68.77 : 14.36
FP EMULATION : 182.96 : 87.79 : 20.26
FOURIER : 18324 : 20.84 : 11.70
ASSIGNMENT : 26.685 : 101.54 : 26.34
IDEA : 3188.5 : 48.77 : 14.48
HUFFMAN : 1368 : 37.93 : 12.11
NEURAL NET : 33.227 : 53.38 : 22.45
LU DECOMPOSITION : 1030.6 : 53.39 : 38.55
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 58.219
FLOATING-POINT INDEX: 39.013
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.169
INTEGER INDEX : 14.803
FLOATING-POINT INDEX: 21.638
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1614.9 : 41.41 : 13.60
STRING SORT : 108.04 : 48.28 : 7.47
BITFIELD : 3.9851e+08 : 68.36 : 14.28
FP EMULATION : 182.96 : 87.79 : 20.26
FOURIER : 18364 : 20.88 : 11.73
ASSIGNMENT : 26.685 : 101.54 : 26.34
IDEA : 3189.8 : 48.79 : 14.49
HUFFMAN : 1368.5 : 37.95 : 12.12
NEURAL NET : 33.24 : 53.40 : 22.46
LU DECOMPOSITION : 1046.8 : 54.23 : 39.16
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 58.177
FLOATING-POINT INDEX: 39.250
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.111
INTEGER INDEX : 14.830
FLOATING-POINT INDEX: 21.770
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)

TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1613 : 41.37 : 13.59
STRING SORT : 107.44 : 48.01 : 7.43
BITFIELD : 4.0225e+08 : 69.00 : 14.41
FP EMULATION : 182.92 : 87.77 : 20.25
FOURIER : 18411 : 20.94 : 11.76
ASSIGNMENT : 26.635 : 101.35 : 26.29
IDEA : 3191.1 : 48.81 : 14.49
HUFFMAN : 1368.5 : 37.95 : 12.12
NEURAL NET : 33.227 : 53.38 : 22.45
LU DECOMPOSITION : 1092.4 : 56.59 : 40.86
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 58.185
FLOATING-POINT INDEX: 39.842
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
C compiler : 3.2.2
libc : unknown version
MEMORY INDEX : 14.120
INTEGER INDEX : 14.826
FLOATING-POINT INDEX: 22.098
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
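
To judge the two sets of three niced runs, it helps to compare the difference between the flag sets with the run-to-run spread. Here is a small sketch using the Linux FLOATING-POINT INDEX values from those six runs. The variable names are made up for illustration, and three runs per setting is far too few for real statistics.

```python
from statistics import mean

# Linux FLOATING-POINT INDEX from the three niced runs of each flag set above.
baseline = [21.599, 21.630, 21.581]   # ... -funroll-loops -finline-functions
fpmath = [21.638, 21.770, 22.098]     # same flags plus -mfpmath=sse,387

def summarize(runs):
    """Return (mean, spread), where spread is max minus min across runs."""
    return mean(runs), max(runs) - min(runs)

base_mean, base_spread = summarize(baseline)
fp_mean, fp_spread = summarize(fpmath)
gain_pct = 100.0 * (fp_mean - base_mean) / base_mean

print(f"baseline: mean={base_mean:.3f} spread={base_spread:.3f}")
print(f"fpmath:   mean={fp_mean:.3f} spread={fp_spread:.3f}")
print(f"difference in means: {gain_pct:.2f}%")
```

With these numbers the means differ by roughly 1%, while the spread within the -mfpmath=sse,387 runs alone is larger than that difference, so these runs do not settle the question either way.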
Malakin
Veteran
Joined: 14 Apr 2002
Posts: 1692
Location: Victoria BC Canada
Posted: Wed Apr 02, 2003 1:34 am

Here is a simple test that proves -march=athlon-xp enables sse, mmx and 3dnow support.

What I've done is emerge "jpeg" with a bunch of different CFLAGS, checking libjpeg.so.62.0.0 each time to see whether its md5sum has changed. If it hasn't changed, then the added flag definitely had no effect on what gcc produced.

I'm using a base of "-march=athlon-xp -O2 -pipe" and adding to it.

Here are the results (did the md5sum change?):
NO -msse
NO -mmmx
NO -m3dnow
YES -mfpmath=sse,387

I then tried all the flags at the same time; the md5 was the same as with just -mfpmath=sse,387 (as expected).

This proves that sse, mmx and 3dnow support are all enabled with -march=athlon-xp.

So in the end, if you're using -march=athlon-xp, don't worry about all this other stuff because it doesn't make any difference.
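
The md5sum comparison described above could be scripted roughly like this. It is a sketch, assuming a copy of the library was saved after each build. The function names are hypothetical, and hashlib.md5 stands in for the md5sum command.

```python
import hashlib

def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def flag_changed_output(baseline_lib, candidate_lib):
    """True if the two builds differ byte for byte (judged by MD5)."""
    return md5_of(baseline_lib) != md5_of(candidate_lib)
```

Usage would be something like flag_changed_output("base/libjpeg.so.62.0.0", "msse/libjpeg.so.62.0.0"), where both paths are placeholders for wherever the two builds were saved. A differing digest only shows that the generated code changed, not that it got any faster.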

Comments on Gnufsh's testing:
-fprefetch and -falign-functions are enabled with -O2.
-finline-functions is enabled with -O3.

I doubt using -mfpmath=sse,387 makes any actual performance difference with anything, someone please prove me wrong.
TheCoop
Veteran
Joined: 15 Jun 2002
Posts: 1814
Location: Where you least expect it
Posted: Wed Apr 02, 2003 7:05 am

...
_________________
95% of all computer errors occur between chair and keyboard (TM)

"One World, One web, One program" - Microsoft Promo ad.
"Ein Volk, Ein Reich, Ein Führer" - Adolf Hitler

Change the world - move a rock

Gnufsh
Guru

Joined: 28 Dec 2002
Posts: 400
Location: Portland, OR

Posted: Wed Apr 02, 2003 8:44 am

Malakin wrote:

Comments on Gnufsh's testing:
-fprefetch and -falign-functions are enabled with -O2.
-finline-functions is enabled with -O3.

Here's my gcc -Q -v output for -march=athlon-xp -O3 -fomit-frame-pointer on a test file:
options passed: -lang-c -v -D__GNUC__=3 -D__GNUC_MINOR__=2
-D__GNUC_PATCHLEVEL__=2 -D__GXX_ABI_VERSION=102 -D__ELF__ -Dunix
-D__gnu_linux__ -Dlinux -D__ELF__ -D__unix__ -D__gnu_linux__ -D__linux__
-D__unix -D__linux -Asystem=posix -D__OPTIMIZE__ -D__STDC_HOSTED__=1
-Acpu=i386 -Amachine=i386 -Di386 -D__i386 -D__i386__ -D__athlon
-D__athlon__ -D__athlon_sse__ -D__tune_athlon__ -D__tune_athlon_sse__
-D__SSE__ -D__MMX__ -D__3dNOW__ -D__3dNOW_A__ -march=athlon-xp -O3
-fomit-frame-pointer
options enabled: -fdefer-pop -fomit-frame-pointer -foptimize-sibling-calls
-fcse-follow-jumps -fcse-skip-blocks -fexpensive-optimizations
-fthread-jumps -fstrength-reduce -fpeephole -fforce-mem -ffunction-cse
-fkeep-static-consts -fcaller-saves -fpcc-struct-return -fgcse -fgcse-lm
-fgcse-sm -frerun-cse-after-loop -frerun-loop-opt
-fdelete-null-pointer-checks -fschedule-insns2 -fsched-interblock
-fsched-spec -fbranch-count-reg -freorder-blocks -frename-registers
-fcprop-registers -fcommon -fgnu-linker -fregmove -foptimize-register-move
-fargument-alias -fstrict-aliasing -fmerge-constants -fident -fpeephole2
-fguess-branch-probability -fmath-errno -ftrapping-math -m80387
-mhard-float -mno-soft-float -mieee-fp -mfp-ret-in-387 -mcpu=athlon-xp
-march=athlon-xp

It doesn't show -finline-functions, -fprefetch-loop-arrays, or -falign-functions. Also, so far using freebench I see different results on some of the tests when adding -fprefetch-loop-arrays.
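
Checking a long "options enabled:" list by eye is error-prone, so here is a small Python sketch that does the lookup. It assumes the output has the same shape as the gcc 3.2.2 -Q -v excerpt above (a whitespace-separated run of flags after the "options enabled:" marker); other gcc versions may print this differently. The function names are invented for the example.

```python
def enabled_options(verbose_output):
    """Collect the flags listed after 'options enabled:' in gcc -Q -v output."""
    marker = "options enabled:"
    start = verbose_output.find(marker)
    if start == -1:
        return set()
    flags = set()
    # The flag list wraps over several lines; stop at the first token
    # that does not look like an option (assumed end of the list).
    for token in verbose_output[start + len(marker):].split():
        if not token.startswith("-"):
            break
        flags.add(token)
    return flags


def missing_flags(verbose_output, wanted):
    """Return the flags from `wanted` that gcc did not report as enabled."""
    enabled = enabled_options(verbose_output)
    return [flag for flag in wanted if flag not in enabled]
```

Fed the output above, `missing_flags(text, ["-finline-functions", "-fprefetch-loop-arrays", "-falign-functions"])` would list the flags that do not appear in the enabled list.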

Malakin
Veteran

Joined: 14 Apr 2002
Posts: 1692
Location: Victoria BC Canada

Posted: Thu Apr 03, 2003 3:09 am

Quote:
It doesn't show -finline-functions, -fprefetch-loop-arrays, or -falign-functions. Also, so far using freebench I see different results on some of the tests when adding -fprefetch-loop-arrays.
It's possible the manual is wrong.

http://gcc.gnu.org/onlinedocs/gcc-3.2.2/gcc/Optimize-Options.html#Optimize%20Options

Near the top there's this:
Quote:
-O3
Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions and -frename-registers options.

Scroll down the page a bit and you'll see this:
Quote:
The following options control specific optimizations. The -O2 option turns on all of these optimizations except -funroll-loops and -funroll-all-loops. On most machines, the -O option turns on the -fthread-jumps and -fdelayed-branch options, but specific machines may handle it differently


OK, I decided to test them out before posting this.
-falign-functions is included in -O2 and -finline-functions is included in -O3, but -fprefetch-loop-arrays isn't included in -O2, so the manual is wrong on that one. I used the same md5sum test as I did for the other flags.

Gnufsh
Guru

Joined: 28 Dec 2002
Posts: 400
Location: Portland, OR

Posted: Thu Apr 03, 2003 5:33 pm

Thanks for checking that one. I believe functions are aligned to 4 by default on x86. Right now, I've got -falign-functions=5. One of the freebench benchmarks does best with functions aligned to 64.

wrc1944
Advocate

Joined: 15 Aug 2002
Posts: 3432
Location: Gainesville, Florida

Posted: Thu Apr 03, 2003 7:48 pm

Hmmmm.... This gets more and more confusing.

According to man gcc, the -mcpu=athlon-xp flag is not redundant, and without specifically using it, the compiler will not generate code which will not run on the i386, even with the -march=athlon-xp flag included. Apparently, this means that with only the -march=athlon-xp flag specified, gcc omits some features specific to the athlon-xp cpu. I assume this would hold true for any specific cpu.

I've also read other places that unless you specifically add the -msse, -mmmx, -m3dnow flags, they won't be included, which in a way seems to reflect what the man gcc info says. Apparently, even though we all have thought -march=athlon-xp implies all the other flags, without specifying them and the cpu individually, it doesn't generate code that won't run on the i386. At least that's my understanding.

I certainly don't know firsthand myself; I just try to understand what I'm reading and draw logical conclusions.

wrc1944
_________________
Main box- AsRock x370 Gaming K4
Ryzen 7 3700x, 3.6GHz, 16GB GSkill Flare DDR4 3200mhz
Samsung SATA 1000GB, Radeon HD R7 350 2GB DDR5
OpenRC Gentoo ~amd64 plasma, glibc-2.36-r7, gcc-13.2.1_p20230304
kernel-6.7.2 USE=experimental python3_11

TheCoop
Veteran

Joined: 15 Jun 2002
Posts: 1814
Location: Where you least expect it

Posted: Thu Apr 03, 2003 9:09 pm

this is very confusing...

typical oss/gpl documentation...

I wonder if I should poke around the code and gcc irc channels and ask a few ppl?

Gnufsh
Guru

Joined: 28 Dec 2002
Posts: 400
Location: Portland, OR

Posted: Thu Apr 03, 2003 9:10 pm

-march=athlon-xp generates code that uses mmx, sse, and 3dnow without any additional flags needing to be specified. Code built with -march=athlon-xp will not run correctly on a machine without these instructions. I tried it on a machine that didn't support sse, and anything that used sse failed to work properly. Apparently my laptop doesn't support sse even though the processor (an athlon-xp Thoroughbred) does; I think the chipset or bios is responsible. -mcpu=athlon-xp, by contrast, only tunes for the athlon-xp and generates code that will still run on an i386.

kappax
Apprentice

Joined: 30 Aug 2002
Posts: 273
Location: The Moon

Posted: Sun Apr 06, 2003 6:24 pm

So in the end, how do I get sse, 3dnow and mmx on my XP1600+?
_________________
My Box
glxgears - 4083.400 FPS
OS: GNU/Linux
Distro: Gentoo
kernel: 2.6.0-test9-mm2
----------------------
vi makes me :wq in word pad :(

Gnufsh
Guru

Joined: 28 Dec 2002
Posts: 400
Location: Portland, OR

Posted: Sun Apr 06, 2003 11:43 pm

-march=athlon-xp should be all you need. It enables the macros that generate sse, mmx, and 3dnow code.

kappax
Apprentice

Joined: 30 Aug 2002
Posts: 273
Location: The Moon

Posted: Mon Apr 07, 2003 1:40 am

Gnufsh wrote:
-march=athlon-xp should be all you need. It enables the macros that generate sse, mmx, and 3dnow code.


so what is this going to do for me ?


Code:

CFLAGS="-march=athlon-xp -O3 -fomit-frame-pointer -pipe -ffast-math -fprefetch-loop-arrays -funroll-loops -finline-functions -falign-jumps=4 -falign-loops=4 -falign-functions=64  -fforce-addr -mmmx -msse -m3dnow -mfpmath=sse,387"
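
Going by the conclusion reached earlier in this thread, the -mmmx, -msse and -m3dnow entries in a line like this add nothing on top of -march=athlon-xp. A rough Python sketch for spotting such duplicates follows. The table of implied flags is an assumption taken from this thread's testing, not from the gcc sources, and the names are invented for the example.

```python
# Assumption from this thread: -march=athlon-xp already turns these on.
IMPLIED_BY_MARCH = {
    "athlon-xp": {"-mmmx", "-msse", "-m3dnow"},
}


def redundant_flags(cflags):
    """Return the flags in `cflags` that the chosen -march already implies."""
    tokens = cflags.split()
    # Pick up the value of the first -march=... token, if there is one.
    march = next(
        (t.split("=", 1)[1] for t in tokens if t.startswith("-march=")),
        None,
    )
    implied = IMPLIED_BY_MARCH.get(march, set())
    return [t for t in tokens if t in implied]
```

For the CFLAGS above this reports -mmmx, -msse and -m3dnow as redundant and leaves -mfpmath=sse,387 alone, which matches the md5sum results posted earlier.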

pagal
n00b

Joined: 17 Feb 2003
Posts: 59

Posted: Mon Apr 07, 2003 4:34 am

Hi,
I'm going to do an emerge -e world and before that I thought I should optimize my CFLAGS as well as USE FLAGS...can anyone help?

--------------------------------------------------------------------------------------
cat /proc/cpuinfo

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 6
cpu MHz : 866.708
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1730.15
--------------------------------------------------------------------------------------

USE="#USE="X gtk gnome -alsa"
USE="gnome -kde -qt arts -nls python perl oggvorbis opengl sdl -postgres jpeg png truetype xml xml2 dvd avi aalib mpeg encode fbcon mmx"

--------------------------------------------------------------------------------------

I use gnome and also use arts instead of alsa.
any help would be appreciated.

Thanks.
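
As a starting point for a question like this, the flags line of /proc/cpuinfo already says which SIMD extensions the CPU reports. The following Python sketch reads them out. The function names are invented for the example, and it assumes the flags sit on a single "flags : ..." line, as they do in the real file (the paste above is wrapped onto two lines).

```python
SIMD_FEATURES = ("mmx", "sse", "sse2", "3dnow")


def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_support(cpuinfo_text):
    """Map each SIMD feature name to whether the CPU reports it."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in SIMD_FEATURES}
```

Calling `simd_support(open("/proc/cpuinfo").read())` on the Pentium III above should report mmx and sse as present and sse2 and 3dnow as absent. That is the feature set gcc documents for -march=pentium3, though whether that is the best choice for this particular box is left as an assumption here.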

All times are GMT