Acovea-4.0.0 : Try out my ebuilds (and scripts)

djm · Post by **djm** » Mon May 17, 2004 6:23 pm

-O1 -march=pentium4 -pipe -fno-omit-frame-pointer -finline-functions -fno-delayed-branch -fcse-follow-jumps -mno-push-args -freorder-blocks -falign-jumps -minline-all-stringops -fno-merge-constants -falign-labels -frename-registers -fno-defer-pop -fsched-spec -fno-signaling-nans

Are any of these dangerous? I also got "Yes" for -mieee-fp and -malign-double, but I read that I shouldn't use them.

Hypnos · Post by **Hypnos** » Mon May 17, 2004 8:25 pm

metal leper wrote:
Code: Select all
-O1 -march=pentium4 -pipe -fno-omit-frame-pointer -finline-functions -fno-delayed-branch -fcse-follow-jumps -mno-push-args -freorder-blocks -falign-jumps -minline-all-stringops -fno-merge-constants -falign-labels -frename-registers -fno-defer-pop -fsched-spec -fno-signaling-nans
Are any of these dangerous? I also got "Yes" for -mieee-fp and -malign-double, but I read that I shouldn't use them.

"-fno-signaling-nans" might screw up math libraries that require proper reporting of floating point exceptions. I would omit it if you do scientific work ...

aethyr · Post by **aethyr** » Tue May 18, 2004 8:46 pm

I wonder if this might make programs compiled with gcc run even faster...

From the mailing list:
http://gcc.gnu.org/ml/gcc/2004-05/msg00031.html

Scott -- I'm curious how mainline performs on all these tests with the
patch below.

Code: Select all

diff -rup orig/egcc-CVS20040502/gcc/c-cppbuiltin.c egcc-CVS20040502/gcc/c-cppbuiltin.c
--- orig/egcc-CVS20040502/gcc/c-cppbuiltin.c	Thu Apr  1 20:03:20 2004
+++ egcc-CVS20040502/gcc/c-cppbuiltin.c	Sun May  2 10:49:49 2004
@@ -384,6 +384,10 @@ c_cpp_builtins (cpp_reader *pfile)
   if (optimize)
     cpp_define (pfile, "__OPTIMIZE__");
 
+  /* GCC's builtins tend to do better than GLIBC's inlines.  */
+  cpp_define (pfile, "__NO_STRING_INLINES");
+  cpp_define (pfile, "__NO_MATH_INLINES");
+  
   if (fast_math_flags_set_p ())
     cpp_define (pfile, "__FAST_MATH__");
   if (flag_really_no_inline)

Results (from the mailing list):

Code: Select all

GCC Mainline, Pentium 4

         default math     with -mfpmath=sse
test  no patch  patched   no patch  patched
----  --------  --------  --------  --------
alma    64.3      42.4      68.2      44.7
 evo    52.9      52.8      79.5      79.8
 fft    34.6      32.0      34.5      33.4
huff    16.0      16.0      16.7      16.3
 lin    26.2      22.9      25.9      23.7
mat1     7.5       7.5       8.0       7.9
mole    32.8      31.3      30.6      22.4
tree    25.0      24.5      25.7      25.6
      --------  --------  --------  --------
total  259.3     229.5     289.0     253.9

Looks good. Anyone brave enough to try? :)
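
To put a number on "looks good": the totals in the table can be turned into a relative improvement. This is only a sketch using the figures quoted above; it assumes the scores are run times (lower is better), and the names `totals` and `percent_saved` are made up for the example.

```python
# Totals copied from the table above as (no patch, patched).
# Assumption: these are run times, so a lower number is better.
totals = {
    "default math": (259.3, 229.5),
    "-mfpmath=sse": (289.0, 253.9),
}


def percent_saved(before, after):
    """Relative reduction of the total, in percent."""
    return 100.0 * (before - after) / before


for mode, (no_patch, patched) in totals.items():
    print(f"{mode}: {percent_saved(no_patch, patched):.1f}% lower total")
```

Both columns come out at a bit over a tenth, with most of the gain coming from alma (and mole under -mfpmath=sse).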

wilburpan · Post by **wilburpan** » Tue May 18, 2004 10:35 pm

Last week I started playing around with gcc-3.4.0, and since I was out of town this weekend I thought I'd do another acovea run. I did a wee bit of hacking because there wasn't a config file for gcc-3.4.0 on a Pentium3.

Here are my results:

Code: Select all

GCC 3.4.0-r2, Pentium III
 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  46.7 |  Yes  | -fstrict-aliasing (-O2)
  35.0 |  Yes  | -mno-align-stringops
  28.9 |  Yes  | -freorder-functions (-O2 GCC 3.3)
  28.7 | Maybe | -funsafe-math-optimizations (fast math)
  28.6 |  Yes  | -minline-all-stringops
  28.2 |  Yes  | -fsched-interblock (-O2 GCC 3.3)
  28.1 | Maybe | -fpeephole2 (-O2)
  28.0 | Maybe | -frename-registers (-O3)
  27.4 |  Yes  | -fno-crossjumping (! -O1)
  27.2 |  Yes  | -falign-functions (-O2 GCC 3.4)
  26.9 | Maybe | -fgcse (-O2)
  26.6 |  Yes  | -fdelete-null-pointer-checks (-O2)
  26.5 | Maybe | -finline-functions (-O3)
  26.1 |  Yes  | -fmove-all-movables
  26.1 | Maybe | -fno-math-errno (fast math)
  26.1 |  Yes  | -fexpensive-optimizations (-O2)
  26.1 | Maybe | -finline-limit
  25.9 |  Yes  | -mno-push-args
  25.6 | Maybe | -fforce-mem (-O2)
  25.4 |  Yes  | -fno-delayed-branch (! -O1)
  25.4 | Maybe | -fpeel-loops
  25.3 | Maybe | -maccumulate-outgoing-args
  25.3 | Maybe | -fno-trapping-math (fast math)
  25.2 | Maybe | -frerun-cse-after-loop (-O2)
  25.1 |  Yes  | -falign-loops (-O2 GCC 3.3)
  24.9 | Maybe | -fno-signaling-nans (fast math)
  24.8 |  Yes  | -fsched-spec (-O2 GCC 3.3)
  24.3 |  Yes  | -malign-double
  23.9 | Maybe | -fcaller-saves (-O2)
  23.0 |  Yes  | -falign-labels (-O2 GCC 3.3)
  23.0 | Maybe | -freorder-blocks (-O2)
  22.8 | Maybe | -frerun-loop-opt (-O2)
  22.8 |  Yes  | -fno-thread-jumps (! -O1)
  22.5 | Maybe | -fcse-follow-jumps (-O2)
  22.2 | Maybe | -fno-merge-constants (! -O1)
  21.6 |   No  | -fweb (-O3 GCC 3.4)
  21.4 |  Yes  | -fno-defer-pop (! -O1)
  21.0 | Maybe | -freduce-all-givs
  20.6 |   No  | -funit-at-a-time (-O2 GCC 3.4)
  20.4 | Maybe | -fcse-skip-blocks (-O2)
  20.1 | Maybe | -foptimize-sibling-calls (-O2)
  19.8 | Maybe | -funswitch-loops
  19.8 | Maybe | -fno-omit-frame-pointer (! -O1)
  19.6 | Maybe | -falign-jumps (-O2 GCC 3.3)
  19.6 | Maybe | -mieee-fp
  19.1 | Maybe | -fregmove (-O2)
  18.0 |   No  | -fprefetch-loop-arrays
  17.8 | Maybe | -fno-if-conversion2 (! -O1)
  17.2 |   No  | -fstrength-reduce (-O2)
  16.5 |   No  | -fno-cprop-registers (! -O1)
  16.5 |   No  | -fno-if-conversion (! -O1)
  15.9 | Maybe | -ffinite-math-only (fast math)
  15.5 |   No  | -ftracer
  14.6 |   No  | -fnew-ra
  14.2 |   No  | -fschedule-insns2 (-O2)
  11.7 |   No  | -fschedule-insns (-O2)
  10.6 |   No  | -ffloat-store
  10.5 |   No  | -funroll-all-loops
   9.3 |   No  | -fno-inline
   7.6 |   No  | -fomit-frame-pointer
   7.4 |   No  | -funroll-loops
   7.3 |   No  | -fno-loop-optimize (! -O1)
   6.2 |   No  | -fbranch-target-load-optimize
   0.0 |   No  | -fno-guess-branch-probability (! -O1)
   0.0 |   No  | -fbranch-target-load-optimize2
   0.0 |   No  | -mfpmath=387
   0.0 |   No  | -mfpmath=sse
   0.0 |   No  | -mfpmath=sse,387
   0.0 |   No  | -momit-leaf-frame-pointer

which suggests:

Code: Select all

CFLAGS="-march=pentium3 -O2 -mno-align-stringops -minline-all-stringops -fno-crossjumping -fmove-all-movables -mno-push-args -fno-delayed-branch -malign-double -fno-thread-jumps -fno-defer-pop -Wall -pipe"

Correct?
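
The rule I seem to have applied by hand can be written down as a sketch: keep every switch graded "Yes" unless its annotation says an -O level already turns it on. The rows below are just a sample of the table, and the regular expression and the function name `extra_flags` are illustrative, not part of acovea. Note that the rule is purely mechanical and says nothing about whether a flag is safe.

```python
import re

# A few rows copied from the table above; a real run would read all of them.
ROWS = """\
  46.7 |  Yes  | -fstrict-aliasing (-O2)
  35.0 |  Yes  | -mno-align-stringops
  28.7 | Maybe | -funsafe-math-optimizations (fast math)
  27.4 |  Yes  | -fno-crossjumping (! -O1)
  21.6 |   No  | -fweb (-O3 GCC 3.4)
  21.4 |  Yes  | -fno-defer-pop (! -O1)
"""

# score | verdict | switch, optionally followed by "(annotation)"
ROW_RE = re.compile(
    r"^\s*([\d.]+)\s*\|\s*(Yes|Maybe|No)\s*\|\s*(\S+)\s*(?:\((.*)\))?\s*$"
)


def extra_flags(text):
    """Return the 'Yes' switches that an -O level does not already enable."""
    flags = []
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if not match:
            continue
        _score, verdict, switch, note = match.groups()
        note = note or ""
        # Assumption: an annotation that starts with "-O" means "already implied".
        if verdict == "Yes" and not note.startswith("-O"):
            flags.append(switch)
    return flags


print(" ".join(extra_flags(ROWS)))
```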

Hypnos · Post by **Hypnos** » Wed May 19, 2004 1:54 am

wilburpan wrote:which suggests:

Code: Select all

CFLAGS="-march=pentium3 -O2 -mno-align-stringops -minline-all-stringops -fno-crossjumping -fmove-all-movables -mno-push-args -fno-delayed-branch -malign-double -fno-thread-jumps -fno-defer-pop -Wall -pipe"

Correct?

-malign-double is dangerous -- it can cause binary incompatibility when linking objects built with and without it.

Also, be advised that "-Wall" can cause compiles to fail! It's harmless, but sometimes "-Wall" trips a bug in gcc (this caused trouble for me when forcing my flags onto glibc).

Maedhros · Post by **Maedhros** » Thu May 20, 2004 2:14 pm

ikaro wrote:..as a side note, anyone got this running with gcc 3.4 ?
here it just segfaults....

It used to do that here... but not any more...

Athlon 64 3200
gcc-3.4.0-r4
glibc-2.3.3_pre20040420-r1

I haven't had time to get any results from it yet, but at least it runs

el_compa · Post by **el_compa** » Sun May 23, 2004 7:15 pm

wilburpan wrote: which suggests:

Code: Select all

CFLAGS="-march=pentium3 -O2 -mno-align-stringops -minline-all-stringops -fno-crossjumping -fmove-all-movables -mno-push-args -fno-delayed-branch -malign-double -fno-thread-jumps -fno-defer-pop -Wall -pipe"

Correct?

One question: if you enable -O2 and then turn off some of the optimizations it implies (like -fcrossjumping), does gcc use the last flag you specified (-fno-crossjumping)?
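
As far as I understand it, for a conflicting -fNAME / -fno-NAME pair the later one takes effect. Here is a toy model of that convention. It is not GCC's actual option parser, it does not model what -O2 expands to, and the function name `effective_flags` is made up for the example.

```python
def effective_flags(args):
    """Resolve -fNAME / -fno-NAME pairs, letting the later occurrence win."""
    state = {}
    for arg in args:
        if arg.startswith("-fno-"):
            state[arg[len("-fno-"):]] = False
        elif arg.startswith("-f"):
            state[arg[len("-f"):]] = True
        # Anything else (-O2, -march=..., -pipe) is ignored by this toy model.
    return state


flags = "-O2 -fcrossjumping -fno-crossjumping -pipe".split()
print(effective_flags(flags))
```

In this model crossjumping ends up disabled, which is what the CFLAGS above rely on.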

Second, in another thread (emerging sash), I had a problem compiling sash with

Code: Select all

-freduce-all-givs

has anybody else had a problem with that flag?

teedog · Post by **teedog** » Tue May 25, 2004 8:13 pm

Is it normal that my Acovea benchmarks are taking a LONG time? I have a Pentium-M 1.4Ghz machine, and after running Acovea overnight, I'm only on the third benchmark, fft. Would it make the benchmark less accurate if I did other things in the background (like compiling)? Thanks.

aethyr · Post by **aethyr** » Tue May 25, 2004 8:23 pm

teedog wrote:Is it normal that my Acovea benchmarks are taking a LONG time?

Yes.

teedog wrote:Would it make the benchmark less accurate if I did other things in the background (like compiling)? Thanks.

Yes.

You could however pause the application with CTRL-Z and resume it later with "fg".

HeartBreakKid · Post by **HeartBreakKid** » Tue May 25, 2004 9:09 pm

So has anyone who has run acovea thus far gotten a Yes for "-fomit-frame-pointer"? Most of the results I have seen so far appear to suggest that this is not as good as (we all) may have believed in the past. Personally, (P4 1300 Mhz, 284MB) I got better results with "-fno-omit-frame-pointer" and most of the results I have seen appear to agree with this. Anyone care to explain?

Hypnos · Post by **Hypnos** » Tue May 25, 2004 11:24 pm

HeartBreakKid wrote:So has anyone who has run acovea thus far gotten a Yes for "-fomit-frame-pointer"? Most of the results I have seen so far appear to suggest that this is not as good as (we all) may have believed in the past. Personally, (P4 1300 Mhz, 284MB) I got better results with "-fno-omit-frame-pointer" and most of the results I have seen appear to agree with this. Anyone care to explain?

"Register starvation" is a non-issue on late-generation CPUs; it was when floating point and SIMD units were first added to CPU dies with the same old registers.

-fomit-frame-pointer might help if you have heavily hand-optimized code that runs close to the hardware, like the kernel, or code that also uses a lot of special instructions like those in 3DNow.

heriophant · Post by **heriophant** » Wed May 26, 2004 7:12 pm

Ok, I ran acovea using this command [ runacovea -config gcc33_pentium4.acovea -bench huffbench.c ] on a P4 2.8 with gcc 3.3, and this is the final output.

best options for population 0
gcc -lrt -lm -std=gnu99 -O1 -march=pentium4 -o TEMP -fno-defer-pop -fno-thread-jumps -fno-omit-frame-pointer -fno-if-conversion2 -fno-crossjumping -foptimize-sibling-calls -fcse-skip-blocks -fexpensive-optimizations -frerun-loop-opt -fforce-mem -fpeephole2 -fschedule-insns2 -fstrict-aliasing -freorder-blocks -fsched-interblock -fsched-spec -falign-jumps -falign-labels -frename-registers -fmove-all-movables -fno-inline -ftracer -fnew-ra -mieee-fp -malign-double -mno-push-args -maccumulate-outgoing-args -mno-align-stringops -minline-all-stringops -mfpmath=sse -fomit-frame-pointer -ffinite-math-only -finline-limit=600 /usr/share/acovea/benchmarks/huffbench.c

best options for population 1
gcc -lrt -lm -std=gnu99 -O1 -march=pentium4 -o TEMP -fno-if-conversion2 -fno-delayed-branch -fno-crossjumping -foptimize-sibling-calls -fcse-skip-blocks -fexpensive-optimizations -frerun-cse-after-loop -fcaller-saves -fforce-mem -fschedule-insns2 -fstrict-aliasing -freorder-blocks -fsched-interblock -fsched-spec -freorder-functions -falign-labels -frename-registers -ffloat-store -fmove-all-movables -freduce-all-givs -fno-inline -ftracer -fnew-ra -mno-push-args -maccumulate-outgoing-args -mno-align-stringops -minline-all-stringops -mfpmath=sse -fomit-frame-pointer -fno-math-errno -funsafe-math-optimizations -fno-trapping-math -ffinite-math-only -fno-signaling-nans /usr/share/acovea/benchmarks/huffbench.c

best options for population 2
gcc -lrt -lm -std=gnu99 -O1 -march=pentium4 -o TEMP -fno-merge-constants -fno-cprop-registers -fno-if-conversion2 -fno-loop-optimize -fno-crossjumping -foptimize-sibling-calls -fcse-follow-jumps -fcse-skip-blocks -fgcse -fexpensive-optimizations -frerun-cse-after-loop -frerun-loop-opt -fschedule-insns2 -fregmove -fdelete-null-pointer-checks -freorder-blocks -fsched-interblock -fsched-spec -falign-jumps -fprefetch-loop-arrays -fmove-all-movables -freduce-all-givs -fno-inline -ftracer -fnew-ra -minline-all-stringops -mfpmath=sse,387 -funsafe-math-optimizations -fno-signaling-nans /usr/share/acovea/benchmarks/huffbench.c

best options for population 3
gcc -lrt -lm -std=gnu99 -O1 -march=pentium4 -o TEMP -fno-merge-constants -fno-defer-pop -fno-thread-jumps -fno-omit-frame-pointer -fno-delayed-branch -fno-loop-optimize -fno-crossjumping -foptimize-sibling-calls -fcse-follow-jumps -fcse-skip-blocks -fexpensive-optimizations -frerun-cse-after-loop -frerun-loop-opt -fcaller-saves -fforce-mem -fschedule-insns2 -falign-loops -falign-jumps -finline-functions -frename-registers -ffloat-store -fmove-all-movables -freduce-all-givs -fno-inline -ftracer -fnew-ra -mieee-fp -mno-align-stringops -fomit-frame-pointer -fno-math-errno -fno-trapping-math -fno-signaling-nans -finline-limit=600 /usr/share/acovea/benchmarks/huffbench.c

best options for population 4
gcc -lrt -lm -std=gnu99 -O1 -march=pentium4 -o TEMP -fno-thread-jumps -fno-crossjumping -foptimize-sibling-calls -fcse-skip-blocks -fexpensive-optimizations -frerun-cse-after-loop -fforce-mem -fschedule-insns2 -fregmove -fstrict-aliasing -freorder-blocks -falign-loops -finline-functions -frename-registers -ffloat-store -fprefetch-loop-arrays -fmove-all-movables -ftracer -fnew-ra -malign-double -mno-align-stringops -fomit-frame-pointer -fno-math-errno -funsafe-math-optimizations -fno-trapping-math -ffinite-math-only /usr/share/acovea/benchmarks/huffbench.c

common options (found in all populations)
gcc -lrt -lm -std=gnu99 -O1 -march=pentium4 -o TEMP -fno-crossjumping -foptimize-sibling-calls -fcse-skip-blocks -fexpensive-optimizations -fschedule-insns2 -fmove-all-movables -ftracer -fnew-ra /usr/share/acovea/benchmarks/huffbench.c

rejected options (rejected by all populations)
gcc -lrt -lm -std=gnu99 -O1 -march=pentium4 -o TEMP -fno-guess-branch-probability -fno-if-conversion -fstrength-reduce -fschedule-insns -funroll-loops /usr/share/acovea/benchmarks/huffbench.c

Option counts:
-fno-merge-constants 13 15 12 5 3 48
-fno-defer-pop 20 16 7 10 6 59
-fno-thread-jumps 1 9 8 11 9 38
-fno-omit-frame-pointer 12 13 12 9 0 46
-fno-guess-branch-probability 0 3 2 2 0 7
-fno-cprop-registers 6 14 7 7 6 40
-fno-if-conversion 0 0 0 0 0 0
-fno-if-conversion2 13 17 12 6 7 55
-fno-delayed-branch 5 8 15 10 8 46
-fno-loop-optimize 0 16 17 10 2 45
-fno-crossjumping 20 17 13 20 12 82
-foptimize-sibling-calls 17 15 10 9 15 66
-fcse-follow-jumps 11 14 13 14 11 63
-fcse-skip-blocks 18 20 18 20 16 92
-fgcse 12 1 9 1 1 24
-fexpensive-optimizations 19 20 16 12 19 86
-fstrength-reduce 4 8 5 1 3 21
-frerun-cse-after-loop 1 7 12 19 13 52
-frerun-loop-opt 9 5 16 20 3 53
-fcaller-saves 0 16 17 15 8 56
-fforce-mem 12 15 0 5 17 49
-fpeephole2 8 3 4 1 8 24
-fschedule-insns 1 3 4 14 5 27
-fschedule-insns2 19 18 18 18 10 83
-fregmove 16 4 16 15 16 67
-fstrict-aliasing 11 9 6 7 17 50
-fdelete-null-pointer-checks 15 15 17 6 7 60
-freorder-blocks 16 7 9 12 15 59
-fsched-interblock 10 6 14 10 14 54
-fsched-spec 10 4 13 13 10 50
-freorder-functions 2 2 10 5 12 31
-falign-loops 18 18 5 13 15 69
-falign-jumps 11 17 16 14 0 58
-falign-labels 12 13 10 7 15 57
-finline-functions 5 15 3 14 12 49
-frename-registers 11 13 9 6 14 53
-ffloat-store 8 18 3 16 17 62
-fprefetch-loop-arrays 8 11 13 10 8 50
-fmove-all-movables 14 14 17 20 20 85
-freduce-all-givs 13 15 10 13 9 60
-fno-inline 19 16 10 20 8 73
-ftracer 18 11 17 18 20 84
-fnew-ra 20 17 16 5 16 74
-funroll-loops 3 0 0 1 0 4
-funroll-all-loops 0 13 8 2 5 28
-mieee-fp 9 4 11 8 9 41
-malign-double 5 10 8 6 6 35
-mno-push-args 9 7 8 1 2 27
-maccumulate-outgoing-args 2 7 4 3 16 32
-mno-align-stringops 6 17 8 16 10 57
-minline-all-stringops 14 11 11 15 8 59
-mfpmath=387 0 7 0 1 0 8
-mfpmath=sse 8 2 9 6 6 31
-mfpmath=sse,387 1 10 5 0 0 16
-fomit-frame-pointer 2 15 16 20 10 63
-momit-leaf-frame-pointer 1 3 0 0 0 4
-fno-math-errno 0 19 11 14 9 53
-funsafe-math-optimizations 7 4 10 12 13 46
-fno-trapping-math 10 10 12 10 10 52
-ffinite-math-only 4 7 9 11 12 43
-fno-signaling-nans 2 10 14 15 11 52
-finline-limit 16 14 3 9 5 47
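
To make sense of the listing above, I read each row as an option name, one count per population (0 to 4), and a total in the last column. That reading is my assumption; the sums do match. A small sketch that parses a few of the rows and checks it (the name `parse_counts` is made up):

```python
# Three rows copied from the "Option counts" listing above.
COUNTS = """\
-fcse-skip-blocks 18 20 18 20 16 92
-fno-if-conversion 0 0 0 0 0 0
-fomit-frame-pointer 2 15 16 20 10 63
"""


def parse_counts(text):
    """Map each option to (per-population counts, reported total)."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 7:
            continue  # not "option + 5 population counts + total"
        option, *numbers = parts
        *per_population, total = map(int, numbers)
        table[option] = (per_population, total)
    return table


for option, (per_pop, total) in parse_counts(COUNTS).items():
    # Assumed layout: the last column is the sum of the five population columns.
    assert sum(per_pop) == total, option
    print(f"{option:24s} total {total:3d}  spread {max(per_pop) - min(per_pop)}")
```

The spread is worth a look: -fomit-frame-pointer ranges from 2 in one population to 20 in another, so the populations did not agree on it.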

value options:
-finline-limit: 600 0 0 600 0 , average = 600 across 2 populations

run complete time: 2004 May 25 03:30:35

optimistic options:
-fno-crossjumping (1.59)
-fcse-skip-blocks (2.064)
-fexpensive-optimizations (1.779)
-fschedule-insns2 (1.637)
-fmove-all-movables (1.732)
-fno-inline (1.163)
-ftracer (1.684)
-fnew-ra (1.21)

pessimistic options:
-fno-guess-branch-probability (-1.966)
-fno-if-conversion (-2.298)
-fgcse (-1.16)
-fstrength-reduce (-1.302)
-fpeephole2 (-1.16)
-fschedule-insns (-1.018)
-funroll-loops (-2.108)
-mno-push-args (-1.018)
-mfpmath=387 (-1.918)
-mfpmath=sse,387 (-1.539)
-momit-leaf-frame-pointer (-2.108)

Now my question is what should my flags be set to?

I run a ~x86 system with gnome, ximian openoffice, xmms, mplayer, ximian evolution and a few misc progs.

Any help would be greatly appreciated. Also, I plan on doing a fresh install with the new settings. Will any of these flags break bootstrapping?

Being a Linux noob for the most part, I'm also wondering what people are using for the -O3 portion of their flags. I'm looking for the fastest system possible.

heriophant · Post by **heriophant** » Wed May 26, 2004 7:23 pm

Ok i just finished most of the thread and now realize there are more benchs that need to be read.

If the command i used in my first test only ran the huffbench, what is the command to run all the tests and then get the final output in a yes no maybe format?

Thanks a ton in advance

Heriophant

Hypnos · Post by **Hypnos** » Wed May 26, 2004 10:47 pm

heriophant wrote:Ok i just finished most of the thread and now realize there are more benchs that need to be read.

If the command i used in my first test only ran the huffbench, what is the command to run all the tests and then get the final output in a yes no maybe format?

Thanks a ton in advance

Heriophant

My scripts to run all the benches and process the results are near the top of the thread.
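
In the meantime, the general shape is one runacovea call per benchmark, with the output saved per benchmark. The sketch below only builds the command lines and does not run anything. It is not the script from the top of the thread: the benchmark list is an assumption based on names seen in this thread, and the function name `bench_commands` is made up.

```python
import shlex

# Assumption: benchmark source files are named <name>bench.c, as with huffbench.c.
BENCHES = ["alma", "evo", "fft", "huff", "lin", "mat1", "tree"]


def bench_commands(config, benches=BENCHES):
    """Build one runacovea shell command per benchmark (nothing is executed)."""
    commands = []
    for bench in benches:
        cmd = ["runacovea", "-config", config, "-bench", f"{bench}bench.c"]
        line = " ".join(shlex.quote(part) for part in cmd)
        # stdout goes to <bench>.run, stderr to <bench>.err
        commands.append(f"{line} >{bench}.run 2>{bench}.err")
    return commands


for line in bench_commands("gcc33_pentium4.acovea"):
    print(line)
```

Turning the resulting .run files into the Yes/No/Maybe table is a separate processing step that this sketch does not cover.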

Angel666 · Post by **Angel666** » Thu May 27, 2004 2:10 pm

I compiled acovea with GCC 3.4.0-r4 from portage, which has been my main compiler for a month now, and it always segfaults after about 30 seconds.

Here is what happens:

Code: Select all

*** alma ***
./acovea.bench: line 10:  7006 Segmentation fault      runacovea -config gcc34_pentium4.acovea -bench ${bench}bench.c >${bench}.run 2>${bench}.err

real    0m13.828s
user    0m13.116s
sys     0m0.716s

*** evo ***
./acovea.bench: line 10:  7435 Segmentation fault      runacovea -config gcc34_pentium4.acovea -bench ${bench}bench.c >${bench}.run 2>${bench}.err

real    0m34.124s
user    0m33.361s
sys     0m0.759s

*** fft ***
./acovea.bench: line 10:  7865 Segmentation fault      runacovea -config gcc34_pentium4.acovea -bench ${bench}bench.c >${bench}.run 2>${bench}.err

real    0m9.894s
user    0m9.121s
sys     0m0.773s

*** huff ***
./acovea.bench: line 10:  8302 Segmentation fault      runacovea -config gcc34_pentium4.acovea -bench ${bench}bench.c >${bench}.run 2>${bench}.err

real    0m25.244s
user    0m23.917s
sys     0m1.259s

*** lin ***
./acovea.bench: line 10:  8734 Segmentation fault      runacovea -config gcc34_pentium4.acovea -bench ${bench}bench.c >${bench}.run 2>${bench}.err

real    0m12.139s
user    0m11.356s
sys     0m0.790s

*** mat1 ***
./acovea.bench: line 10:  9158 Segmentation fault      runacovea -config gcc34_pentium4.acovea -bench ${bench}bench.c >${bench}.run 2>${bench}.err

real    0m18.248s
user    0m17.422s
sys     0m0.832s

*** tree ***

real    266m22.069s
user    263m50.781s
sys     2m8.694s

The only benchmark that runs, strangely, is treebench. I have no idea why this is, and so i will try to recompile acovea with 3.3 and see what happens.

heriophant · Post by **heriophant** » Fri May 28, 2004 2:46 am

teedog · Post by **teedog** » Fri May 28, 2004 4:53 am

poisson wrote:I found another interesting Acovea application: what are the best optimizations for pentium-m? You know, such a processor is a hybrid between pentium3 and pentium4, with 1M L2 cache. I started with "alma", but I don't like to stress my laptop

Did anyone manage to finish benchmarking with a Pentium-M machine? I'd be very interested in knowing about the results.

Also, would pausing the benchmark using CTRL-Z work since realtime is used? I found that after I paused and resumed a benchmark, the resulting realtime value includes the time when the benchmark was paused.
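
The effect is easy to reproduce: wall-clock time keeps running while a process sits idle, CPU time does not. In this sketch `time.sleep` merely stands in for the job being stopped with CTRL-Z (an assumption, since a real stop works through a signal), and the function names are made up.

```python
import time


def busy():
    """A little CPU-bound work to stand in for a benchmark."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def measure(work, pause_seconds):
    """Run work(), then idle, and return (wall, cpu) seconds for the whole span."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    time.sleep(pause_seconds)  # stands in for the time spent paused
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


wall, cpu = measure(busy, 0.5)
print(f"wall {wall:.2f}s, cpu {cpu:.2f}s")
```

The wall figure includes the half-second pause and the CPU figure does not, so any score based on real time would be skewed by a pause.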

teedog · Post by **teedog** » Fri May 28, 2004 5:56 am

Hypnos wrote:
HeartBreakKid wrote:So has anyone who has run acovea thus far gotten a Yes for "-fomit-frame-pointer"? Most of the results I have seen so far appear to suggest that this is not as good as (we all) may have believed in the past. Personally, (P4 1300 Mhz, 284MB) I got better results with "-fno-omit-frame-pointer" and most of the results I have seen appear to agree with this. Anyone care to explain?
"Register starvation" is a non-issue on late-generation CPUs; it was when floating point and SIMD units were first added to CPU dies with the same old registers.

-fomit-frame-pointer might help if you have heavily hand-optimized code that runs close to the hardware, like the kernel, or code that also uses a lot of special instructions like those in 3DNow.

Does it hurt to include -fomit-frame-pointer though if I don't need to do any debugging? Everyone seems to say that -fomit-frame-pointer is one of the params that no one would want to remove.

Hypnos · Post by **Hypnos** » Fri May 28, 2004 9:27 pm

teedog wrote:Does it hurt to include -fomit-frame-pointer though if I don't need to do any debugging? Everyone seems to say that -fomit-frame-pointer is one of the params that no one would want to remove.

-fomit-frame-pointer seems to introduce some overhead (according to the results here) and can cause some builds to break (Gentoo devs try to filter it out where it does).

I don't use it anymore.

Hypnos · Post by **Hypnos** » Fri May 28, 2004 9:30 pm

heriophant wrote:
this is my start anyone care to finish it based on the results.

CFLAGS="-O3 -march=pentium4 -pipe

Thanks a ton in the hand holding until i learn what this all is and about

Right, so now add the flags graded "Yes" and annotated "(! -O1)" -- this will turn off detrimental optimizations implied by "-O1". Also, you might want to add "-mcpu=pentium4" even though it's redundant with "-march=pentium4", as the latter is filtered out by a number of ebuilds.

heriophant · Post by **heriophant** » Sat May 29, 2004 12:00 pm

Ok just wanted to be clear as to your answer if im understanding you correctly this is how my flags should be set

with all the "yes" from the results.

CFLAGS="-O3 -march=pentium4 -mcpu=pentium4 -pipe -fpeephole2 -foptimize-sibling-calls -fno-thread-jumps -falign-jumps -fcse-skip-blocks -fno-omit-frame-pointer -fdelete-null-pointer-checks -fregmove -malign-double -fno-merge-constants -fno-delayed-branch -frerun-cse-after-loop -ffinite-math-only -frename-registers -freorder-functions"

or should it ONLY be the yes that flag !-O1

Heriophant

Hypnos · Post by **Hypnos** » Sat May 29, 2004 7:50 pm

heriophant wrote:Ok just wanted to be clear as to your answer if im understanding you correctly this is how my flags should be set

with all the "yes" from the results.

CFLAGS="-O3 -march=pentium4 -mcpu=pentium4 -pipe -fpeephole2 -foptimize-sibling-calls -fno-thread-jumps -falign-jumps -fcse-skip-blocks -fno-omit-frame-pointer -fdelete-null-pointer-checks -fregmove -malign-double -fno-merge-constants -fno-delayed-branch -frerun-cse-after-loop -ffinite-math-only -frename-registers -freorder-functions"

or should it ONLY be the yes that flag !-O1

The latter (i.e., ONLY Yes + ! -O1). In the CFLAGS above, you've introduced redundant flags (e.g., -fpeephole2 is implied by -O2, which is implied by -O3) and some dangerous flags (e.g., -malign-double).

shanghai · Post by **shanghai** » Sun May 30, 2004 11:48 pm

Hi everyone.
I have a Duron (Acer Aspire 1300), with 256 RAM.
While the first test (alma) finished in four hours, all the others seem to take 7-9 hours.
Is that possible, or is something wrong?
I installed acovea from portage, but then I switched to init 3 and killed every service before launching acovea (and I launched it with -18 niceness, all to avoid the timing errors discussed in this thread), and I hacked the configs to match the laptop (athlon-xp). Before I lose another three days finishing it, am I doing something wrong?

Thank you guys

S.

wilburpan · Post by **wilburpan** » Mon May 31, 2004 12:00 am

The times that I have run acovea, the benchmarks did vary widely in terms of the time required to run each benchmark, with some taking more than twice as long as the shortest test, so your experience does not seem totally out of whack to me.

However, it seems that if acovea is to give you good results, you really need to not have the computer do anything but run acovea. Earlier in this thread you can find my experience running acovea on my laptop from the command line, and another run where I was running a desktop environment at the same time. I got very different results between the two runs. It seems that for accurate results, you really cannot have anything else running at the same time.

For the most accurate results, I would recommend starting it on Friday night and finding something else to do for the weekend.

shanghai · Post by **shanghai** » Mon May 31, 2004 12:07 am

Thanks so much

(side note: oh, shit. I just discovered that all the system services I thought I had stopped were running. Anyway, no cron job is set up on the PC, the X server was started but not used, and I really hope syslog had nothing to write. Maybe that's why it took all that time.

Anyways, the job was running as nice -n -18 ... let's hope).
