taskara Advocate
Joined: 10 Apr 2002 Posts: 3763 Location: Australia
|
Posted: Fri Sep 24, 2004 4:07 am Post subject: |
|
|
Thanks for your help _________________ Kororaa install method - have Gentoo up and running quickly and easily, fully automated with an installer! |
|
|
|
Crocodil Apprentice
Joined: 06 May 2004 Posts: 163 Location: Poznan, Poland
|
Posted: Wed Sep 29, 2004 8:42 am Post subject: |
|
|
Hi
I've run the tests over the last 2 days and here are my "Yes" answers:
Quote: | 40.9 | Yes | -fforce-mem (-O2)
35.8 | Yes | -fno-if-conversion2 (! -O1)
34.6 | Yes | -finline-functions (-O3)
31.9 | Yes | -fno-thread-jumps (! -O1)
30.9 | Yes | -mno-align-stringops
30.9 | Yes | -falign-loops (-O2 GCC 3.3)
29.2 | Yes | -freorder-blocks (-O2)
29.1 | Yes | -fno-merge-constants (! -O1)
27.9 | Yes | -fdelete-null-pointer-checks (-O2)
27.8 | Yes | -malign-double
27.6 | Yes | -minline-all-stringops
27.2 | Yes | -fsched-spec (-O2 GCC 3.3)
25.3 | Yes | -fno-defer-pop (! -O1)
25.2 | Yes | -fno-delayed-branch (! -O1)
24.7 | Yes | -finline-limit
24.6 | Yes | -fno-cprop-registers (! -O1)
|
Are any of these too unsafe to use?
Thank you very much Hypnos for the scripts!!
I also have this idea / question to you...
Maybe you could edit your first post and add to it the scripts for running and getting results from Acovea (both on gcc -3.3.x and gcc-3.4.x)? This topic has gotten quite long and finding them isn't an easy task
Also it would be great if you could add a small "database" of flags that you know are unsafe to use....
This would make a small "HOWTO" out of the first post
I hope you think over my propositions
Best regards,
Crocodil. |
|
|
|
Hypnos Advocate
Joined: 18 Jul 2002 Posts: 2889 Location: Omnipresent
|
Posted: Wed Sep 29, 2004 8:50 am Post subject: |
|
|
Crocodil wrote: | Are any of these too unsafe to use? |
"-malign-double" can cause significant breakage, but the rest look okay.
Quote: | I hope you think over my propositions |
When I have time _________________ Personal overlay | Simple backup scheme |
|
|
|
celtx n00b
Joined: 28 Sep 2004 Posts: 3
|
Posted: Fri Oct 01, 2004 6:05 pm Post subject: my results athlon-xp |
|
|
I ran the acovea test suite over several days one or two tests at a time.
Used GCC 3.3 on an athlon-xp system.
My configuration :
copied the opteron files
modified the march flag
-march=athlon-xp
and added in changes to comment out known unsafe items.
<!-- Options specific to "fast math" -->
<!-- flag type="simple" value="-fno-math-errno" /-->
<!-- flag type="simple" value="-funsafe-math-optimizations" /-->
<!-- flag type="simple" value="-fno-trapping-math" /-->
<!-- flag type="simple" value="-ffinite-math-only" /-->
<!--flag type="simple" value="-fno-signaling-nans" /-->
<!-- flag type="simple" value="-mieee-fp" /-->
<!-- flag type="simple" value="-fnew-ra" /-->
here were the Yes answers
34.3 | Yes | -fsched-spec (-O2 GCC 3.3)
28.2 | Yes | -falign-loops (-O2 GCC 3.3)
27.7 | Yes | -fdelete-null-pointer-checks (-O2)
26.2 | Yes | -freorder-functions (-O2 GCC 3.3)
25.2 | Yes | -freorder-blocks (-O2)
24.1 | Yes | -fno-merge-constants (! -O1)
which translates to
-O1 -march=athlon-xp -fsched-spec -falign-loops -fdelete-null-pointer-checks -freorder-functions -freorder-blocks -fno-merge-constants -pipe
by default I was already using -O2 which covers majority of the items above
a few flags contained in -O2 were listed as No
-frerun-loop-opt
-fforce-mem
-fschedule-insns
-fschedule-insns2
-fregmove
-fstrict-aliasing
-falign-labels
the rest were all listed as Maybe
the test showed that on my system adding -fno-merge-constants
to the default of -O2 may help and that a few -O2 flags may not help that much.
So the default -O2 -march=athlon-xp -pipe is a good choice already on this system. |
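The step from a results table to a flag string, as done by hand above, can be sketched in shell. This is only a minimal sketch, assuming rows of the form `score | Yes | -flag (annotation)` as pasted in this thread, with no trailing `|` at the end of a row; the function name `yes_flags` and the file name `results.txt` are hypothetical and not part of Acovea or of Hypnos's scripts.

```shell
#!/bin/sh
# Hypothetical helper: print every flag marked "Yes" in a results table
# (rows like "34.3 | Yes | -fsched-spec (-O2 GCC 3.3)") on a single line.
yes_flags() {
    awk -F'|' '
        # The verdict is the second-to-last column and the switch is the last,
        # so tables with extra leading columns (Mean, Std. Dev., Conf.) also work.
        NF >= 3 && $(NF-1) ~ /^[ \t]*Yes[ \t]*$/ {
            split($NF, parts, " ")      # keep the flag, drop the "(annotation)"
            printf "%s%s", sep, parts[1]
            sep = " "
        }
        END { print "" }
    ' "$@"
}
```

Usage would be `yes_flags results.txt`, or piping a pasted table into `yes_flags`. The output is just a list of candidates; it still needs the kind of review done in this thread (for example, dropping `-malign-double`).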
|
|
|
galay2 Apprentice
Joined: 02 Feb 2004 Posts: 208
|
Posted: Mon Oct 04, 2004 7:20 pm Post subject: |
|
|
I copied the script from page 2, and whenever I run the first script (named start), I get "Could not find that benchmark, so here's some help.". How come? I have these files in
Code: | ls /usr/share/acovea/benchmarks/
almabench.c fftbench.c linbench.c treebench.c evobench.c huffbench.c mat1bench.c
|
Here's the script I use
Code: | #!/bin/sh
BENCHES="alma evo fft huff lin mat1 tree"
for bench in $BENCHES; do
echo ""
echo "*** $bench ***"
echo ${bench}bench.c
time runacovea -config gcc33_pentium4.acovea -bench ${bench}bench.c\
1> ${bench}.run 2> ${bench}.err
done
|
Also I can run the command itself
Code: |
time runacovea -config gcc33_pentium4.acovea -bench fftbench.c 1> ${bench}.run 2> ${bench}.err
|
|
|
|
|
Hypnos Advocate
Joined: 18 Jul 2002 Posts: 2889 Location: Omnipresent
|
Posted: Mon Oct 04, 2004 8:20 pm Post subject: |
|
|
galay2,
try putting the command in the script all on one line as you did standalone.
Also, make sure there are no trailing spaces after the "\" in the script. _________________ Personal overlay | Simple backup scheme |
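Why the trailing backslash matters can be seen in a small shell demonstration. A backslash followed directly by a newline is removed entirely, so if the next line starts at column one, `bench.c` and `1` are glued into the single word `bench.c1`, and the `>` that follows is then a plain redirection. That would explain the "Could not find that benchmark" message. The sketch below uses `echo` as a stand-in for `runacovea` and a hypothetical scratch file `demo.out`.

```shell
#!/bin/sh
# Broken: no space before the backslash and none at the start of the next line.
# The shell sees "echo almabench.c1> demo.out", so the argument is "almabench.c1".
echo almabench.c\
1> demo.out
broken=$(cat demo.out)

# Fixed: a space before the backslash keeps "1>" a separate redirection.
echo almabench.c \
1> demo.out
fixed=$(cat demo.out)

rm -f demo.out
echo "broken: $broken"
echo "fixed:  $fixed"
```

Putting the whole command on one line, as suggested above, avoids the problem altogether.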
|
|
|
galay2 Apprentice
Joined: 02 Feb 2004 Posts: 208
|
Posted: Wed Oct 06, 2004 3:08 pm Post subject: |
|
|
Hypnos: I got it working, thx, but I have a question. Are you supposed to avoid running anything else while it's running, and also turn off all services, X, etc.? Because if scoring is based on run time, wouldn't other applications affect the benchmarks?
Also, how accurate is a GA in general? It's not exactly an exact answer, is it?
Lastly, is it possible to resume? I'm assuming each test can be run individually, and that the result script won't show anything until all tests are done? (Because I'm getting all 'No's after alma, evo and fft.)
Thanks |
|
|
|
stikboy n00b
Joined: 14 Sep 2003 Posts: 68 Location: Colorado
|
Posted: Thu Oct 07, 2004 4:45 am Post subject: |
|
|
I am currently running the scripts to see what flags I could change to. My question is this:
After I finish running these tests, I am going to re-install Gentoo. This is so I can use reiser4, nptl and udev. I'm wondering, since everything else I am going to install after the new base install will be the same as I currently have, will the CFLAGS that are recommended by these scripts be usable on the next install? or will reiser4 and/or nptl skew the results a lot? |
|
|
|
Deranger Veteran
Joined: 26 Aug 2004 Posts: 1215
|
Posted: Thu Oct 07, 2004 6:34 am Post subject: |
|
|
stikboy wrote: | I am currently running the scripts to see what flags I could change to. My question is this:
After I finish running these tests, I am going to re-install Gentoo. This is so I can use reiser4, nptl and udev. I'm wondering, since everything else I am going to install after the new base install will be the same as I currently have, will the CFLAGS that are recommended by these scripts be usable on the next install? or will reiser4 and/or nptl skew the results a lot? |
Those C(XX)FLAGS are very, very optimized, based on your current system. I would say if you run these tests under your new system, you'll get different results. But these tests give a good overview of C(XX)FLAGS you should use. |
|
|
|
TheKat n00b
Joined: 24 Jan 2004 Posts: 49
|
Posted: Sat Oct 09, 2004 6:48 pm Post subject: "Safe" to use system while acovea runs? |
|
|
So here's my question, basic as it is.
I currently have Acovea running on my system and am using a roommate's.
If I use my system to actually do stuff while Acovea runs, will I skew the results at all? Does Acovea perform its own, CPU-time-independent analysis, or is it best to just leave a system running until it's done, and not put any other stress at all on it? |
|
|
|
SirRichard n00b
Joined: 10 Oct 2004 Posts: 13 Location: Stockelsdorf, Germany
|
Posted: Sun Oct 10, 2004 2:51 pm Post subject: |
|
|
And here are MY results - calculated on my system (Dual Athlon-MP 2800+, 2G RAM, gcc 3.4.2 with modified config for this system):
Code: |
Score | So? | Switch (annotation)
------------------------------------------------------------------------------
35.5 | Yes | -finline-functions (-O3)
31.9 | Yes | -ftracer
31.7 | Yes | -fno-merge-constants (! -O1)
31.3 | Maybe | -fgcse (-O2)
31.1 | Yes | -fno-defer-pop (! -O1)
30.6 | Yes | -fsched-interblock (-O2 GCC 3.3)
29.4 | Maybe | -fweb
29.4 | Maybe | -ffinite-math-only (fast math)
28.5 | Yes | -fno-delayed-branch (! -O1)
28.1 | Yes | -finline-limit
27.2 | Yes | -frerun-cse-after-loop (-O2)
27.0 | Maybe | -fno-math-errno (fast math)
26.4 | Maybe | -fstrength-reduce (-O2)
25.9 | Maybe | -funsafe-math-optimizations (fast math)
25.8 | Yes | -fno-thread-jumps (! -O1)
25.8 | Maybe | -fcaller-saves (-O2)
25.5 | Yes | -falign-functions
25.3 | Maybe | -mieee-fp
25.3 | Maybe | -fpeel-loops
25.1 | Maybe | -funswitch-loops
24.9 | No | -fstrict-aliasing (-O2)
24.3 | Yes | -fdelete-null-pointer-checks (-O2)
24.2 | Yes | -mno-align-stringops
24.2 | Yes | -fsched-spec (-O2 GCC 3.3)
24.1 | Maybe | -fexpensive-optimizations (-O2)
23.5 | Yes | -foptimize-sibling-calls (-O2)
23.2 | Maybe | -malign-double
23.2 | Yes | -fcse-skip-blocks (-O2)
23.0 | Maybe | -fno-signaling-nans (fast math)
22.9 | Maybe | -maccumulate-outgoing-args
22.6 | Maybe | -falign-loops (-O2 GCC 3.3)
22.3 | Maybe | -minline-all-stringops
21.8 | Maybe | -fno-crossjumping (! -O1)
21.8 | Maybe | -fschedule-insns2 (-O2)
21.5 | Maybe | -falign-jumps (-O2 GCC 3.3)
20.8 | Maybe | -frerun-loop-opt (-O2)
20.8 | Yes | -fmove-all-movables
20.7 | Maybe | -freorder-blocks (-O2)
20.7 | Maybe | -fno-cprop-registers (! -O1)
20.4 | Maybe | -fno-if-conversion2 (! -O1)
20.2 | Maybe | -falign-labels (-O2 GCC 3.3)
20.1 | No | -fprefetch-loop-arrays
18.6 | Maybe | -mno-push-args
18.1 | No | -fno-trapping-math (fast math)
17.8 | Maybe | -fno-omit-frame-pointer (! -O1)
17.5 | No | -fforce-mem (-O2)
17.2 | Maybe | -fregmove (-O2)
17.0 | Maybe | -fcse-follow-jumps (-O2)
16.5 | Maybe | -frename-registers (-O3)
16.2 | Maybe | -freduce-all-givs
15.6 | No | -fomit-frame-pointer
15.6 | No | -fno-if-conversion (! -O1)
15.2 | No | -funit-at-a-time
13.7 | No | -freorder-functions (-O2 GCC 3.3)
13.5 | No | -fpeephole2 (-O2)
12.2 | Maybe | -fbranch-target-load-optimize2
11.6 | No | -fnew-ra
9.7 | No | -fno-inline
8.0 | No | -fschedule-insns (-O2)
6.6 | No | -ffloat-store
0.0 | No | -fno-guess-branch-probability (! -O1)
0.0 | No | -fno-loop-optimize (! -O1)
0.0 | No | -funroll-loops
0.0 | No | -funroll-all-loops
0.0 | No | -fbranch-target-load-optimize
0.0 | No | -mfpmath=387
0.0 | No | -mfpmath=sse
0.0 | No | -mfpmath=sse,387
0.0 | No | -momit-leaf-frame-pointer
|
I don't know how much it differs from single-CPU Athlon XP systems, but I wanted to try it myself. The test ran about 26 hours 48 mins and gave me the results above. Now I am considering which flags will be useful and boost my system. -ftracer goes straight into my CFLAGS, but what about the others? I think "-O3 -pipe -march=athlon-mp -ftracer" is right for me (as "-mtune=athlon-mp" is implied by "-march=athlon-mp").
I dislike the thought of using either -O1 and activating some flags or -O3 and disabling some flags. I trust in -O3 creating generally good code (yes, I know, that may be naive). Do these flags like "-fno-merge-constants" really make such a difference? |
|
|
|
Tanisete Guru
Joined: 12 Mar 2004 Posts: 312
|
Posted: Fri Nov 05, 2004 3:36 pm Post subject: |
|
|
I've run the acovea test on my AMD 1700+, and after the perl script I have these "Yes" results:
Code: |
Score | So? | Switch (annotation)
------------------------------------------------------------------------------
42.5 | Yes | -fstrict-aliasing (-O2)
30.1 | Yes | -fno-omit-frame-pointer (! -O1)
29.9 | Yes | -fno-defer-pop (! -O1)
29.8 | Yes | -mno-push-args
29.7 | Yes | -fno-trapping-math (fast math)
29.7 | Yes | -ftracer
29.4 | Yes | -fno-delayed-branch (! -O1)
29.2 | Yes | -frerun-cse-after-loop (-O2)
28.1 | Yes | -maccumulate-outgoing-args
28.0 | Yes | -fno-cprop-registers (! -O1)
27.9 | Yes | -fno-merge-constants (! -O1)
27.8 | Yes | -fno-crossjumping (! -O1)
27.4 | Yes | -falign-jumps (-O2 GCC 3.3)
26.7 | Yes | -malign-double
25.9 | Yes | -fsched-interblock (-O2 GCC 3.3)
25.9 | Yes | -freorder-functions (-O2 GCC 3.3)
25.5 | Yes | -fno-thread-jumps (! -O1)
25.3 | Yes | -mno-align-stringops
25.2 | Yes | -fsched-spec (-O2 GCC 3.3)
25.0 | Yes | -frerun-loop-opt (-O2)
22.7 | Yes | -falign-labels (-O2 GCC 3.3)
|
Which of them could be dangerous or cause an unstable system? I know that -malign-double is dangerous, but I don't know about the other ones...
Thanks a lot for all the scripts!!! They're great!!!!!!!!!! Very easy to use. |
|
|
|
tnt Veteran
Joined: 27 Feb 2004 Posts: 1222
|
Posted: Wed Nov 10, 2004 11:16 am Post subject: |
|
|
I've run the tests and got these results on my Barton box:
Code: | Score | So? | Switch (annotation)
------------------------------------------------------------------------------
41.2 | Yes | -fgcse (-O2)
37.4 | Yes | -finline-functions (-O3)
31.9 | Yes | -mno-push-args
30.1 | Maybe | -fstrict-aliasing (-O2)
30.0 | Maybe | -fno-delayed-branch (! -O1)
29.0 | Yes | -mno-align-stringops
29.0 | Yes | -fno-signaling-nans (fast math)
28.9 | Maybe | -fno-trapping-math (fast math)
28.6 | Maybe | -fmove-all-movables
28.6 | Maybe | -fno-math-errno (fast math)
28.6 | Maybe | -ftracer
27.6 | Maybe | -frename-registers (-O3)
27.5 | Yes | -fno-cprop-registers (! -O1)
27.3 | Yes | -fno-crossjumping (! -O1)
27.3 | Yes | -fexpensive-optimizations (-O2)
26.9 | Maybe | -frerun-cse-after-loop (-O2)
26.6 | Yes | -funswitch-loops
25.8 | Maybe | -fsched-interblock (-O2 GCC 3.3)
24.9 | Maybe | -fweb (-O3 GCC 3.4)
23.9 | No | -fstrength-reduce (-O2)
23.9 | Maybe | -funsafe-math-optimizations (fast math)
23.4 | Maybe | -fno-if-conversion2 (! -O1)
22.9 | Maybe | -fdelete-null-pointer-checks (-O2)
22.7 | Maybe | -fno-omit-frame-pointer (! -O1)
22.2 | No | -fprefetch-loop-arrays
22.2 | Maybe | -maccumulate-outgoing-args
21.8 | Yes | -fno-merge-constants (! -O1)
21.6 | No | -fschedule-insns2 (-O2)
21.6 | No | -fforce-mem (-O2)
21.5 | Maybe | -fno-thread-jumps (! -O1)
21.5 | Maybe | -falign-functions (-O2 GCC 3.4)
21.4 | Maybe | -falign-loops (-O2 GCC 3.3)
21.2 | No | -ffinite-math-only (fast math)
21.2 | Maybe | -finline-limit
20.8 | No | -freduce-all-givs
20.8 | Maybe | -fcse-skip-blocks (-O2)
20.4 | Maybe | -freorder-blocks (-O2)
19.7 | Maybe | -fno-defer-pop (! -O1)
19.5 | No | -funit-at-a-time (-O2 GCC 3.4)
19.1 | Maybe | -fcse-follow-jumps (-O2)
18.8 | Maybe | -fpeel-loops
18.4 | Maybe | -fcaller-saves (-O2)
18.4 | No | -fno-if-conversion (! -O1)
18.1 | Maybe | -fpeephole2 (-O2)
17.8 | No | -foptimize-sibling-calls (-O2)
17.7 | No | -frerun-loop-opt (-O2)
17.5 | No | -mieee-fp
17.1 | No | -falign-jumps (-O2 GCC 3.3)
16.7 | Maybe | -falign-labels (-O2 GCC 3.3)
15.6 | No | -freorder-functions (-O2 GCC 3.3)
15.2 | Maybe | -fsched-spec (-O2 GCC 3.3)
12.3 | No | -fregmove (-O2)
11.5 | No | -fno-inline
11.2 | No | -fschedule-insns (-O2)
11.1 | No | -minline-all-stringops
9.2 | No | -funroll-loops
8.5 | No | -fnew-ra
8.2 | No | -ffloat-store
7.0 | No | -fbranch-target-load-optimize2
6.8 | No | -fno-guess-branch-probability (! -O1)
6.5 | No | -fbranch-target-load-optimize
6.2 | No | -mfpmath=sse
0.0 | No | -fno-loop-optimize (! -O1)
0.0 | No | -funroll-all-loops
0.0 | No | -mfpmath=387
0.0 | No | -mfpmath=sse,387
|
But I've noticed that some of my old flags are missing: they are neither Yes, nor Maybe, nor No. What happened to them?
Code: | -pipe
-fomit-frame-pointer
-fforce-addr
-fprefetch-loop-arrays
-ffast-math |
Maybe it's because I've changed Opteron's config file? Should I use some other config?
|
|
|
|
nmcsween Guru
Joined: 12 Nov 2003 Posts: 381
|
Posted: Thu Nov 11, 2004 10:18 am Post subject: |
|
|
Hypnos wrote: |
....
Well, in principle, if you run Acovea with nothing else running, you're optimizing towards a dedicated system. You are more likely to optimize for smaller code size in a "dirtier" testing environment.
|
Wouldn't this be best, since most environments are dirty, meaning that there are multiple processes running? Or is acovea multithreaded? That would mean the environment is already dirty. _________________ Great Resources |
|
|
|
TiE10 n00b
Joined: 23 Jun 2004 Posts: 59 Location: NA
|
Posted: Fri Nov 12, 2004 7:27 pm Post subject: |
|
|
Sorry if this was already addressed, but I skimmed and I didn't see, so here goes:
I copied the scripts posted earlier in the forum
Code: |
#!/bin/sh
BENCHES="alma evo fft huff lin mat1 tree"
for bench in $BENCHES; do
echo ""
echo "*** $bench ***"
time runacovea -config gcc33_pentium4.acovea -bench ${bench}benc$
1> ${bench}.run 2> ${bench}.err
done
|
and i tried running them, but the output was this:
Code: |
tie10@phoenix4188 ~/acovea $ sh ./acoveascript.sh
*** alma ***
./acoveascript.sh: line 8: alma.run: Permission denied
real 0m0.001s
user 0m0.000s
sys 0m0.001s
*** evo ***
./acoveascript.sh: line 8: evo.run: Permission denied
real 0m0.001s
user 0m0.000s
sys 0m0.001s
*** fft ***
./acoveascript.sh: line 8: fft.run: Permission denied
real 0m0.001s
user 0m0.001s
sys 0m0.000s
*** huff ***
./acoveascript.sh: line 8: huff.run: Permission denied
real 0m0.002s
user 0m0.000s
sys 0m0.000s
*** lin ***
./acoveascript.sh: line 8: lin.run: Permission denied
real 0m0.001s
user 0m0.000s
sys 0m0.001s
*** mat1 ***
./acoveascript.sh: line 8: mat1.run: Permission denied
real 0m0.001s
user 0m0.000s
sys 0m0.001s
*** tree ***
./acoveascript.sh: line 8: tree.run: Permission denied
real 0m0.001s
user 0m0.000s
sys 0m0.001s
|
How can I fix that..? _________________ "Common sense is something earned by thinking." |
|
|
|
TiE10 n00b
Joined: 23 Jun 2004 Posts: 59 Location: NA
|
Posted: Sun Nov 14, 2004 5:56 pm Post subject: |
|
|
bump _________________ "Common sense is something earned by thinking." |
|
|
|
TiE10 n00b
Joined: 23 Jun 2004 Posts: 59 Location: NA
|
Posted: Tue Nov 16, 2004 4:02 am Post subject: |
|
|
bump _________________ "Common sense is something earned by thinking."
Last edited by TiE10 on Tue Nov 16, 2004 5:01 am; edited 1 time in total |
|
|
|
aethyr Veteran
Joined: 06 Apr 2003 Posts: 1085 Location: NYC
|
Posted: Tue Nov 16, 2004 4:17 am Post subject: |
|
|
I have no idea, do you have permission to run it in that directory? How did you install it?
Try this:
Code: | cd ~/
runacovea -config gcc33_pentium4.acovea -bench almabench.c 1> almabench.run 2> almabench.err |
If that works, then make sure you're running it somewhere you can write to.
[edit] I'm not sure if you just pasted it wrong into the forums, but the script you pasted is slightly different from the one Hypnos wrote. Make sure you're running the proper script:
Code: | #!/bin/sh
BENCHES="alma evo fft huff lin mat1 tree"
for bench in $BENCHES; do
    echo ""
    echo "*** $bench ***"
    time runacovea -config gcc33_pentium4.acovea -bench ${bench}bench.c \
        1> ${bench}.run 2> ${bench}.err
done |
Here's where what you pasted is wrong:
Code: | time runacovea -config gcc33_pentium4.acovea -bench ${bench}benc$
1> ${bench}.run 2> ${bench}.err |
|
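The "Permission denied" messages in the output above come from the shell failing to create the `.run` files, so the benchmark command is apparently never started (hence the near-zero timings). A minimal sketch of a guard that could sit at the top of such a script, assuming the results are meant to land in the current directory; the function name `check_writable` is hypothetical.

```shell
#!/bin/sh
# Hypothetical guard: succeed only if the given directory (default: the
# current one) exists and is writable, so that redirections such as
# "1> alma.run" can actually create their files there.
check_writable() {
    dir=${1:-.}
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        return 0
    fi
    echo "cannot write results to '$dir'" >&2
    return 1
}

check_writable . || echo "pick a directory you own, e.g. \$HOME"
```

This does not cover every case: an existing `alma.run` owned by another user (for example, left over from a run as root) would still fail even in a writable directory.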
|
|
|
TiE10 n00b
Joined: 23 Jun 2004 Posts: 59 Location: NA
|
Posted: Tue Nov 16, 2004 5:00 am Post subject: |
|
|
Well, I still couldn't get the script to work for some reason. When I did it before I upgraded to nitro3, I just copied the scripts and did, for example:
Code: |
tie10@phoenix4188 ~/acovea $ sh ./acoveascript.sh
|
and it ran the tests
I took your suggestion to run it the recommended way and emerged acovea and used your command:
Code: |
runacovea -config gcc33_pentium4.acovea -bench almabench.c 1> almabench.run 2> almabench.err
|
and it seems to be running fine now. thx for the reply
Oh yeah, and that was just a mistake in the copying: the $ where the difference was located is where my terminal cut off the text, so I was using the right script. But what was wrong? Which permissions does it need? Well, whatever, I've got it benchmarking now either way. _________________ "Common sense is something earned by thinking." |
|
|
|
Gergan Penkov Veteran
Joined: 17 Jul 2004 Posts: 1464 Location: das kleinste Kuhdorf Deutschlands :)
|
Posted: Wed Nov 17, 2004 11:44 pm Post subject: this program is simply buggy |
|
|
OK, here it is. I installed acovea because I had a problem in the past: my system crawled after a fresh install with reiserfs4, udev and new CFLAGS :( At first I thought it was reiserfs4. Yes, it slows down the kernel because of the 8k stack, but that was not the main problem. Then I also tried without udev, all in vain, and I had no idea what the hell was going on. From login to prompt took 2 sec, and GNOME and Xfce started very slowly; the system crawled when starting applications. It was laggy rather than slow. Then I changed the kernel from mm, which has been too risky and unstable lately, to nitro and dev-sources, with and without preempt and so on. The result was absolutely nothing, perhaps a little bit faster with the 4k stack. I checked the hard disk and it was absolutely normal; I had a steady:
Code: | # hdparm -Tt /dev/hda
/dev/hda:
Timing cached reads: 608 MB in 2.00 seconds = 303.74 MB/sec
Timing buffered disk reads: 164 MB in 3.03 seconds = 54.12 MB/sec
|
That was the crucial point: I understood that the problem was in the CFLAGS. I had -march=athlon-xp -O3 -fomit-frame-pointer, nothing fancy, proclaimed to be fast for Athlon and stable ::))) and I wanted to optimize ::) So I installed acovea and waited for the last test: "***glibc detected*** double free" or something similar for all the runs in the tree test, and at the end a segmentation fault ::) I was in a panic. I thought it was a hardware problem, although the system worked without problems, slow but very stable, and it was not overclocked. Then came the second thought: I changed the config to work for athlon-xp (especially -fomit-frame-pointer, which is off at all -O levels for athlon-xp, and quite rightly so). Then I saw that the test used -fomit-frame-pointer together with -fno-omit-frame-pointer, and I relaxed, thinking I had found the problem. Alas ...
Then I recompiled gcc, glibc, and acovea and its dependent libraries simply with -march=athlon-xp -O2 -pipe. This didn't help either, but as a result the system was not so laggy and compilation was faster ::) Then I read the announcement from Red Hat about the new glibc, and here we are:
http://www.fedorazine.com/index.php?option=content&task=view&id=292
for your viewing pleasure. So the question is: what can a buggy program achieve, one which claims to optimize binaries but itself uses code that is not secure and has bugs ...! And another question: what does the program in fact test? The speed of execution, OK, but what is the connection to the speed of execution under heavy load (CPU or disk, for example)? Perhaps a program is fast but its loading is slow ....
The conclusion ... I have learned my lesson and will no longer strive for unreachable speeds.
It's enough that doom3 is faster on linux ....
Have fun |
|
|
|
mekong Tux's lil' helper
Joined: 23 Apr 2004 Posts: 93 Location: Rdam - NL - EU
|
Posted: Thu Nov 18, 2004 5:07 am Post subject: Acovea & Pentium2 |
|
|
I don't see a lot of people running Acovea on a Pentium II. I did, and it took a week.
My system:
Code: |
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 3
model name : Pentium II (Klamath)
stepping : 3
cpu MHz : 233.364
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov mmx
bogomips : 459.77
|
Code: |
gcc version 3.4.3 (Gentoo Linux 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)
|
The result:
Code: |
Mean | Std. Dev. | Conf. | Score | So? | Switch (annotation)
------------------------------------------------------------------------------
0.322 | 0.080 | 1.000 | 32.2 | Yes | -fforce-mem (-O2)
0.362 | 0.255 | 0.845 | 30.6 | Maybe | -fstrict-aliasing (-O2)
0.304 | 0.111 | 0.994 | 30.2 | Yes | -foptimize-sibling-calls (-O2)
0.342 | 0.239 | 0.848 | 29.1 | Maybe | -funsafe-math-optimizations (fast math)
0.281 | 0.093 | 0.998 | 28.0 | Yes | -fdelete-null-pointer-checks (-O2)
0.278 | 0.093 | 0.997 | 27.8 | Yes | -fno-defer-pop (! -O1)
0.290 | 0.144 | 0.957 | 27.7 | Yes | -fno-math-errno (fast math)
0.287 | 0.143 | 0.956 | 27.5 | Yes | -freorder-functions (-O2 GCC 3.3)
0.283 | 0.134 | 0.965 | 27.3 | Yes | -ffinite-math-only (fast math)
0.327 | 0.238 | 0.829 | 27.1 | Maybe | -freorder-blocks (-O2)
0.267 | 0.102 | 0.991 | 26.4 | Yes | -fno-crossjumping (! -O1)
0.267 | 0.112 | 0.984 | 26.3 | Yes | -finline-functions (-O3)
0.262 | 0.053 | 1.000 | 26.2 | Yes | -fpeel-loops
0.323 | 0.252 | 0.801 | 25.9 | Maybe | -fstrength-reduce (-O2)
0.264 | 0.117 | 0.976 | 25.8 | Yes | -fno-merge-constants (! -O1)
0.254 | 0.085 | 0.997 | 25.3 | Yes | -fno-cprop-registers (! -O1)
0.293 | 0.198 | 0.862 | 25.3 | Maybe | -fexpensive-optimizations (-O2)
0.268 | 0.144 | 0.938 | 25.2 | Yes | -fno-trapping-math (fast math)
0.248 | 0.081 | 0.998 | 24.8 | Yes | -falign-labels (-O2 GCC 3.3)
0.249 | 0.104 | 0.984 | 24.5 | Yes | -frerun-cse-after-loop (-O2)
0.242 | 0.070 | 0.999 | 24.2 | Yes | -fno-omit-frame-pointer (! -O1)
0.257 | 0.139 | 0.936 | 24.0 | Yes | -fcse-skip-blocks (-O2)
0.264 | 0.157 | 0.908 | 24.0 | Yes | -maccumulate-outgoing-args
0.244 | 0.104 | 0.982 | 24.0 | Yes | -fcse-follow-jumps (-O2)
0.283 | 0.207 | 0.828 | 23.5 | Maybe | -fschedule-insns2 (-O2)
0.249 | 0.133 | 0.939 | 23.4 | Yes | -fno-delayed-branch (! -O1)
0.233 | 0.054 | 1.000 | 23.2 | Yes | -fmove-all-movables
0.257 | 0.157 | 0.898 | 23.0 | Yes | -fno-signaling-nans (fast math)
0.241 | 0.122 | 0.952 | 22.9 | Yes | -mfpmath=387
0.262 | 0.175 | 0.864 | 22.6 | Maybe | -fsched-interblock (-O2 GCC 3.3)
0.226 | 0.075 | 0.997 | 22.5 | Yes | -fno-if-conversion2 (! -O1)
0.226 | 0.086 | 0.991 | 22.4 | Yes | -funit-at-a-time
0.242 | 0.139 | 0.919 | 22.2 | Yes | -fpeephole2 (-O2)
0.253 | 0.168 | 0.867 | 22.0 | Yes | -mieee-fp
0.231 | 0.119 | 0.948 | 21.9 | Yes | -mno-align-stringops
0.227 | 0.109 | 0.962 | 21.8 | Yes | -minline-all-stringops
0.246 | 0.155 | 0.887 | 21.8 | Yes | -falign-functions
0.280 | 0.229 | 0.778 | 21.8 | Maybe | -fweb
0.228 | 0.115 | 0.953 | 21.8 | Yes | -fsched-spec (-O2 GCC 3.3)
0.225 | 0.120 | 0.939 | 21.1 | Yes | -funswitch-loops
0.253 | 0.185 | 0.828 | 21.0 | Maybe | -fcaller-saves (-O2)
0.247 | 0.175 | 0.840 | 20.7 | Maybe | -fno-thread-jumps (! -O1)
0.223 | 0.128 | 0.917 | 20.4 | Yes | -mno-push-args
0.204 | 0.065 | 0.998 | 20.4 | Yes | -frerun-loop-opt (-O2)
0.237 | 0.162 | 0.855 | 20.2 | Maybe | -malign-double
0.291 | 0.298 | 0.671 | 19.5 | No | -fgcse (-O2)
0.260 | 0.227 | 0.747 | 19.4 | Maybe | -fno-if-conversion (! -O1)
0.214 | 0.136 | 0.885 | 19.0 | Yes | -fregmove (-O2)
0.253 | 0.221 | 0.747 | 18.9 | Maybe | -finline-limit
0.212 | 0.133 | 0.890 | 18.8 | Yes | -ftracer
0.257 | 0.241 | 0.714 | 18.3 | Maybe | -falign-jumps (-O2 GCC 3.3)
0.254 | 0.245 | 0.701 | 17.8 | Maybe | -falign-loops (-O2 GCC 3.3)
0.240 | 0.213 | 0.740 | 17.8 | Maybe | -frename-registers (-O3)
0.241 | 0.231 | 0.702 | 16.9 | Maybe | -funroll-all-loops
0.216 | 0.191 | 0.741 | 16.0 | Maybe | -freduce-all-givs
0.206 | 0.171 | 0.771 | 15.9 | Maybe | -fschedule-insns (-O2)
0.226 | 0.235 | 0.662 | 15.0 | No | -fno-inline
0.217 | 0.284 | 0.556 | 12.1 | No | -fnew-ra
0.210 | 0.268 | 0.566 | 11.9 | No | -fomit-frame-pointer
0.181 | 0.218 | 0.594 | 10.7 | No | -funroll-loops
0.130 | 0.153 | 0.605 | 7.9 | No | -fbranch-target-load-optimize
0.147 | 0.224 | 0.488 | 7.2 | No | -fno-loop-optimize (! -O1)
0.057 | 0.158 | 0.000 | 0.0 | No | -fno-guess-branch-probability (! -O1)
0.101 | 0.214 | 0.000 | 0.0 | No | -ffloat-store
0.114 | 0.088 | 0.000 | 0.0 | No | -fbranch-target-load-optimize2
0.098 | 0.115 | 0.000 | 0.0 | No | -momit-leaf-frame-pointer
|
|
|
|
|
mrfree Veteran
Joined: 15 Mar 2003 Posts: 1303 Location: Europe.Italy.Sulmona
|
Posted: Mon Dec 20, 2004 9:06 am Post subject: |
|
|
I post my acovea experience here
Any ideas??? _________________ Please EU, pimp my country!
ICE: /etc/init.d/iptables panic |
|
|
|
Twist Guru
Joined: 03 Jan 2003 Posts: 414 Location: San Diego
|
Posted: Thu Dec 23, 2004 11:05 am Post subject: |
|
|
These were my results, using:
Athlon64 3400+ (used the gcc34_opteron.acovea profile for acovea)
gcc 3.4.3
Tested with Gnome running, no apps active
Code: |
Score | So? | Switch (annotation)
------------------------------------------------------------------------------
35.8 | Yes | -minline-all-stringops
32.6 | Yes | -mno-push-args
31.8 | Maybe | -finline-functions (-O3)
31.8 | Yes | -fexpensive-optimizations (-O2)
30.4 | Maybe | -fschedule-insns (-O2)
30.3 | Maybe | -fpeel-loops
30.1 | Yes | -fno-if-conversion2 (! -O1)
29.8 | Yes | -fno-defer-pop (! -O1)
29.7 | Yes | -fcse-skip-blocks (-O2)
29.1 | Maybe | -frerun-loop-opt (-O2)
28.3 | Yes | -fsched-interblock (-O2 GCC 3.3)
28.2 | Yes | -foptimize-sibling-calls (-O2)
27.4 | Yes | -falign-jumps (-O2 GCC 3.3)
27.4 | Maybe | -fstrict-aliasing (-O2)
26.9 | Maybe | -fno-merge-constants (! -O1)
26.5 | Maybe | -finline-limit
26.1 | Maybe | -falign-functions
25.7 | Maybe | -fno-delayed-branch (! -O1)
25.4 | Maybe | -fpeephole2 (-O2)
25.4 | Maybe | -freorder-functions (-O2 GCC 3.3)
25.0 | Maybe | -fno-signaling-nans (fast math)
25.0 | Maybe | -freorder-blocks (-O2)
24.7 | No | -fstrength-reduce (-O2)
24.4 | Maybe | -frerun-cse-after-loop (-O2)
24.3 | Yes | -fmove-all-movables
24.2 | Maybe | -fcse-follow-jumps (-O2)
23.6 | Maybe | -fschedule-insns2 (-O2)
23.2 | Maybe | -fno-math-errno (fast math)
22.8 | Yes | -fsched-spec (-O2 GCC 3.3)
22.8 | Maybe | -maccumulate-outgoing-args
22.5 | Maybe | -fdelete-null-pointer-checks (-O2)
22.5 | Maybe | -falign-labels (-O2 GCC 3.3)
22.4 | Maybe | -fno-thread-jumps (! -O1)
22.3 | Maybe | -mieee-fp
22.2 | Maybe | -ftracer
22.0 | Maybe | -mno-align-stringops
21.4 | Maybe | -fno-crossjumping (! -O1)
21.3 | Maybe | -fno-cprop-registers (! -O1)
21.3 | Yes | -funit-at-a-time
21.1 | Maybe | -frename-registers (-O3)
20.9 | Maybe | -ffinite-math-only (fast math)
20.8 | Maybe | -fno-trapping-math (fast math)
20.6 | Maybe | -funswitch-loops
20.4 | No | -fweb
20.2 | Maybe | -fcaller-saves (-O2)
20.1 | No | -falign-loops (-O2 GCC 3.3)
19.9 | No | -fgcse (-O2)
19.1 | No | -fno-omit-frame-pointer (! -O1)
17.3 | No | -funsafe-math-optimizations (fast math)
17.1 | No | -fno-if-conversion (! -O1)
15.6 | No | -fregmove (-O2)
15.4 | Maybe | -fbranch-target-load-optimize
15.1 | No | -fprefetch-loop-arrays
13.6 | No | -fnew-ra
13.4 | No | -fno-inline
12.2 | No | -freduce-all-givs
12.2 | No | -funroll-all-loops
11.5 | No | -fforce-mem (-O2)
8.7 | No | -funroll-loops
5.2 | No | -fno-loop-optimize (! -O1)
4.6 | No | -ffloat-store
0.0 | No | -fno-guess-branch-probability (! -O1)
0.0 | No | -fbranch-target-load-optimize2
0.0 | No | -mfpmath=387
0.0 | No | -mfpmath=sse
0.0 | No | -mfpmath=sse,387 |
|
|
|
|
Mango Madness n00b
Joined: 16 Jun 2003 Posts: 6
|
Posted: Sat Dec 25, 2004 9:18 pm Post subject: |
|
|
So has anyone done any real-world testing with something like mencoder to see whether running acovea by itself or in a "busy" environment would produce better results? As soon as I get my Christmas hardware going, I'll do some of my own benchmarks... just trying to gather input from others. Merry Christmas! |
|
|
|
Twist Guru
Joined: 03 Jan 2003 Posts: 414 Location: San Diego
|
Posted: Sun Dec 26, 2004 12:57 pm Post subject: |
|
|
I started a new topic to discuss my results using the flags indicated by these scripts, since it is a separate discussion from this one (this topic is more appropriately about the usage and support of the aforementioned scripts).
You can see the results here. The very short form I'm afraid to say is that I did not get positive results with Acovea flags. Read the topic for details.
-Twist |
|
|
|
|