borism n00b
Joined: 24 Oct 2003 Posts: 13 Location: Australia
Posted: Fri Jun 24, 2005 3:05 pm Post subject: intel icc 8.1 compiler performs poorly in nbench on my PC
I've got an Athlon XP 2000+. I recently installed icc 8.1.032 and I am very disappointed. On my system it is slower than gcc 3.4.4 (with glibc 2.3.5).
I untarred nbench manually so that I could play with the settings in the makefile and run a few benchmarks.
I started off by compiling with the "default" -O2, which produced a larger binary and faster code than my default gcc -O2. Then I read the man pages. It turns out that icc enables unsafe math optimisations and loop unrolling by default. Ouch. That explains the large binary. As a matter of fact, even with -Os the Intel binary is larger than gcc's with -O2. So in the space-speed tradeoff (so much talked about in CFLAGS threads) gcc definitely holds its own.
So I decided to get the best out of both compilers as far as speed is concerned.
For Intel I enabled function inlining (-Ob2) and interprocedural optimisations (-ipo), generated code for the P3 (-xK, to include SSE) and tuned it for the P4 (-tpp7; the Athlon XP's generation). Loop unrolling, fast math and frame pointer omission are enabled by default.
For gcc I enabled fast maths, loop unrolling and frame pointer omission. -O3 enables function inlining. To be fair to the Intel compiler I optimised for the P3 (-march=pentium3) and tuned for the P4 (-mtune=pentium4).
So, apples to apples as much as possible.
Here are the results:
Code: |
icc -O3 -Ob2 -ipo -xK -tpp7
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 913.12 : 23.42 : 7.69
STRING SORT : 98.84 : 44.16 : 6.84
BITFIELD : 3.3639e+08 : 57.70 : 12.05
FP EMULATION : 87.2 : 41.84 : 9.66
FOURIER : 15462 : 17.58 : 9.88
ASSIGNMENT : 15.434 : 58.73 : 15.23
IDEA : 3377.3 : 51.65 : 15.34
HUFFMAN : 1174.4 : 32.57 : 10.40
NEURAL NET : 24.463 : 39.30 : 16.53
LU DECOMPOSITION : 726.8 : 37.65 : 27.19
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 42.408
FLOATING-POINT INDEX: 29.631
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU : AuthenticAMD AMD Athlon(TM) XP 2000+ 1670MHz
L2 Cache : 256 KB
OS : Linux 2.6.11-gentoo-r8
C compiler : /opt/intel/compiler81/bin/icc
libc :
MEMORY INDEX : 10.787
INTEGER INDEX : 10.432
FLOATING-POINT INDEX: 16.435
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder. |
Code: |
gcc -O3 -fomit-frame-pointer -ffast-math -funroll-loops -march=pentium3 -mtune=pentium4
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 1433.6 : 36.77 : 12.07
STRING SORT : 106.4 : 47.54 : 7.36
BITFIELD : 3.5612e+08 : 61.09 : 12.76
FP EMULATION : 166.92 : 80.10 : 18.48
FOURIER : 16534 : 18.80 : 10.56
ASSIGNMENT : 24.193 : 92.06 : 23.88
IDEA : 2663.5 : 40.74 : 12.10
HUFFMAN : 1256.5 : 34.84 : 11.13
NEURAL NET : 29.644 : 47.62 : 20.03
LU DECOMPOSITION : 845.52 : 43.80 : 31.63
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 52.623
FLOATING-POINT INDEX: 33.976
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU : AuthenticAMD AMD Athlon(TM) XP 2000+ 1670MHz
L2 Cache : 256 KB
OS : Linux 2.6.11-gentoo-r8
MEMORY INDEX : 13.088
INTEGER INDEX : 13.164
FLOATING-POINT INDEX: 18.844
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
|
The Intel compiler loses in 9 benchmarks out of 10, and on some by a wide margin (FP EMULATION runs at roughly half of gcc's speed). With all the hype surrounding the Intel compiler's speed I am very surprised and very disappointed.
I'd like to hear from people who have benchmarked the Intel compiler on their own systems. Please include your CFLAGS; they matter a great deal.
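For anyone who wants to check the tally, here is a minimal Python sketch that compares the two Athlon XP runs test by test. The iterations/sec figures are copied from the two tables above; the names `ATHLON_XP`, `compare` and `icc_wins` are made up for this illustration and are not part of nbench.

```python
# Iterations/sec from the two Athlon XP tables above, as (icc, gcc) pairs.
ATHLON_XP = {
    "NUMERIC SORT": (913.12, 1433.6),
    "STRING SORT": (98.84, 106.4),
    "BITFIELD": (3.3639e+08, 3.5612e+08),
    "FP EMULATION": (87.2, 166.92),
    "FOURIER": (15462, 16534),
    "ASSIGNMENT": (15.434, 24.193),
    "IDEA": (3377.3, 2663.5),
    "HUFFMAN": (1174.4, 1256.5),
    "NEURAL NET": (24.463, 29.644),
    "LU DECOMPOSITION": (726.8, 845.52),
}


def compare(results):
    """Return {test: icc/gcc ratio}; a ratio below 1.0 means gcc was faster."""
    return {name: icc / gcc for name, (icc, gcc) in results.items()}


def icc_wins(results):
    """Return the sorted names of the tests where icc ran more iterations/sec."""
    return sorted(name for name, ratio in compare(results).items() if ratio > 1.0)


if __name__ == "__main__":
    ratios = compare(ATHLON_XP)
    # Print the worst icc results first.
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:<17} icc/gcc = {ratio:.2f}")
    print("icc wins:", icc_wins(ATHLON_XP))
```

Comparing raw iterations/sec rather than the index columns keeps the check independent of either baseline machine.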
virtual Tux's lil' helper
Joined: 12 Aug 2004 Posts: 132 Location: Bergen
Posted: Fri Jun 24, 2005 5:57 pm
Hi,
Maybe it's because you don't have an Intel processor but one from AMD. I know they are compatible, but their internal architectures differ.
borism n00b
Joined: 24 Oct 2003 Posts: 13 Location: Australia
Posted: Sun Jun 26, 2005 12:27 am
virtual wrote: | Hi,
Maybe because you do not have an Intel processor but one from AMD. I know they are compatible but their internal architectures differ |
I don't have a P4 to play with, but I ran nbench on a PIII laptop running Minislack. The binaries were compiled on my Athlon XP using the same compilers. Minislack has glibc version 2.3.4 compiled for i486.
To the results:
Code: |
icc -O3 -xK -ipo -Ob2
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 266.24 : 6.83 : 2.24
STRING SORT : 22.072 : 9.86 : 1.53
BITFIELD : 8.6047e+07 : 14.76 : 3.08
FP EMULATION : 22.165 : 10.64 : 2.45
FOURIER : 4868.1 : 5.54 : 3.11
ASSIGNMENT : 4.5063 : 17.15 : 4.45
IDEA : 1040.5 : 15.91 : 4.73
HUFFMAN : 336.1 : 9.32 : 2.98
NEURAL NET : 5.614 : 9.02 : 3.79
LU DECOMPOSITION : 196.16 : 10.16 : 7.34
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 11.517
FLOATING-POINT INDEX: 7.976
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU : GenuineIntel Pentium III (Coppermine) 499MHz
L2 Cache : 256 KB
OS : Linux 2.6.11.6
C compiler : /opt/intel/compiler81/bin/icc
libc :
MEMORY INDEX : 2.756
INTEGER INDEX : 2.966
FLOATING-POINT INDEX: 4.424
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
|
Code: |
gcc -O3 -fomit-frame-pointer -ffast-math -funroll-loops -march=pentium3
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 398.56 : 10.22 : 3.36
STRING SORT : 22.373 : 10.00 : 1.55
BITFIELD : 8.341e+07 : 14.31 : 2.99
FP EMULATION : 30.988 : 14.87 : 3.43
FOURIER : 5070.1 : 5.77 : 3.24
ASSIGNMENT : 4.1961 : 15.97 : 4.14
IDEA : 807.4 : 12.35 : 3.67
HUFFMAN : 360.59 : 10.00 : 3.19
NEURAL NET : 5.7312 : 9.21 : 3.87
LU DECOMPOSITION : 268 : 13.88 : 10.03
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 12.311
FLOATING-POINT INDEX: 9.033
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU : GenuineIntel Pentium III (Coppermine) 499MHz
L2 Cache : 256 KB
OS : Linux 2.6.11.6
MEMORY INDEX : 2.675
INTEGER INDEX : 3.408
FLOATING-POINT INDEX: 5.010
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder. |
This time the Intel compiler manages to win 3 tests out of 10 (BITFIELD, ASSIGNMENT and IDEA), but it still falls well short of gcc 3.4's optimisation quality, even on a previous-generation Intel processor.
Would anyone with a P4 be interested in comparing these compilers?
I guess I won't be compiling anything with icc until August, when the free version 9.0 becomes available.
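A note on reading the summary lines: as far as the figures show, each index nbench prints is the geometric mean of the per-test indices in its group. The sketch below checks this for the original BYTEmark FLOATING-POINT INDEX on the Pentium III, using the "Old Index" column for FOURIER, NEURAL NET and LU DECOMPOSITION. The grouping of tests is an assumption based on how the numbers line up, and the names `geometric_mean` and `PIII_FP_OLD_INDEX` are only for illustration.

```python
import math


def geometric_mean(values):
    """n-th root of the product of the values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# "Old Index" column (Pentium 90 baseline) for FOURIER, NEURAL NET and
# LU DECOMPOSITION, copied from the two Pentium III tables above.
PIII_FP_OLD_INDEX = {
    "icc": [5.54, 9.02, 10.16],
    "gcc": [5.77, 9.21, 13.88],
}

# FLOATING-POINT INDEX values reported by nbench in the same tables.
REPORTED = {"icc": 7.976, "gcc": 9.033}

if __name__ == "__main__":
    for compiler, indices in PIII_FP_OLD_INDEX.items():
        computed = geometric_mean(indices)
        print(f"{compiler}: computed {computed:.3f}, reported {REPORTED[compiler]}")
```

The computed values agree with the reported ones only to about two decimal places, because the per-test indices in the tables are themselves rounded.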
madmango Guru
Joined: 15 Jul 2003 Posts: 507 Location: PA, USA
Posted: Tue Jul 12, 2005 4:31 pm
If you read the charges that AMD is bringing against Intel, they claim that code built with icc specifically takes a poorly optimized code path when it is run on an AMD processor.
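To make the allegation concrete, here is a purely hypothetical Python sketch of what dispatching on the CPU vendor string, rather than on the features the CPU actually supports, would look like. It is not Intel's code and says nothing about how icc really behaves. `vendor_id`, `pick_code_path`, the path names and the sample text are all invented for illustration; only the `vendor_id` field and the two vendor strings come from the nbench output earlier in the thread.

```python
def vendor_id(cpuinfo_text):
    """Return the first vendor_id value found in /proc/cpuinfo-style text, or None."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "vendor_id":
            return value.strip()
    return None


def pick_code_path(vendor):
    """Hypothetical dispatcher: keys on the vendor string, not on CPU features."""
    return "sse_optimised" if vendor == "GenuineIntel" else "generic"


# Sample text in the style of /proc/cpuinfo on the Athlon XP box (illustrative).
SAMPLE = (
    "processor\t: 0\n"
    "vendor_id\t: AuthenticAMD\n"
    "model name\t: AMD Athlon(TM) XP 2000+\n"
)

if __name__ == "__main__":
    vendor = vendor_id(SAMPLE)
    print(vendor, "->", pick_code_path(vendor))
```

Under such a scheme an Athlon XP, which does support SSE, would still be given the generic path. Note that this would not by itself explain the Pentium III results above.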