| Author |
Message |
Hypnos Veteran


Joined: 18 Jul 2002 Posts: 1880 Location: Bay Area/Tokyo
|
Posted: Tue Apr 13, 2004 11:58 pm Post subject: |
|
|
| Ultraoctane.com wrote: | | I think it would be better if it were input directly via an upload. A web interface seems like a burden on whoever runs the test. We could load it with a Perl socket connection, although I think I need root access on the server to use Perl sockets. If anyone knows of a better way, please post. |
You can just use an unprivileged port, which is 1024 or higher, I believe.
I don't know of any good way to get the user CPU info, or validate the benchmark output (maybe this should be done by Acovea). _________________ ~ Lenovo Thinkpad T61 w/ GM965 -- (most) everything works!
~ Étoilé, a document- and project-driven desktop (in the GNUstep overlay) |
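The unprivileged-port suggestion above can be illustrated with a minimal sketch. It is written in Python rather than the Perl mentioned in the thread, and the function names and the 1024 threshold constant are assumptions for illustration, not part of any existing upload script.

```python
import socket

FIRST_UNPRIVILEGED_PORT = 1024  # on Linux, binding below this normally needs root


def is_unprivileged(port: int) -> bool:
    """True if an ordinary user can normally bind to this TCP port."""
    return FIRST_UNPRIVILEGED_PORT <= port <= 65535


def listen_for_results(port: int, host: str = "0.0.0.0") -> socket.socket:
    """Open a listening socket for result uploads on an unprivileged port."""
    if not is_unprivileged(port):
        raise ValueError(f"port {port} would normally require root")
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def receive_one(server: socket.socket) -> bytes:
    """Accept a single connection and read until the sender closes it."""
    conn, _addr = server.accept()
    with conn:
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def send_results(host: str, port: int, payload: bytes) -> None:
    """Upload the bytes of a results file to the collecting server."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(payload)
```

On the server, something like `receive_one(listen_for_results(8080))` would collect one upload without root; the port number 8080 is only an example.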
|
nmcsween Guru


Joined: 12 Nov 2003 Posts: 381
|
Posted: Wed Apr 14, 2004 12:34 am Post subject: |
|
|
| We could get the CPU info from /proc/cpuinfo and attach it to the file being sent, then run a PHP script that checks whether the attached /proc/cpuinfo and the info in the test results match. |
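A rough sketch of that check follows, in Python rather than the PHP proposed above. It assumes the uploaded file carries a verbatim copy of /proc/cpuinfo and that comparing the `vendor_id` and `model name` fields is enough; both the helper names and the choice of fields are assumptions.

```python
def parse_cpuinfo(text: str) -> dict:
    """Parse the first processor block of /proc/cpuinfo-style text into a dict."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def same_cpu(attached: str, reported: str, keys=("vendor_id", "model name")) -> bool:
    """True if both texts contain the chosen fields with identical values."""
    a, b = parse_cpuinfo(attached), parse_cpuinfo(reported)
    return all(k in a and a[k] == b.get(k) for k in keys)
```

A submitter could read the text with `open("/proc/cpuinfo").read()` and send it along with the results; the server would then call `same_cpu()` on the two copies.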
|
Hypnos Veteran


Joined: 18 Jul 2002 Posts: 1880 Location: Bay Area/Tokyo
|
Posted: Wed Apr 14, 2004 12:49 am Post subject: |
|
|
| Ultraoctane.com wrote: | | We could get the CPU info from /proc/cpuinfo and attach it to the file being sent, then run a PHP script that checks whether the attached /proc/cpuinfo and the info in the test results match. |
That might work; isn't /proc/cpuinfo deprecated? |
|
nmcsween Guru


Joined: 12 Nov 2003 Posts: 381
|
Posted: Wed Apr 14, 2004 1:11 am Post subject: |
|
|
| It was going to be deprecated in 2.5.4 but it seems to still be around. We could always use a fallback if /proc/cpuinfo is gone on a machine. What is /proc/cpuinfo getting replaced with? |
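The fallback idea could be sketched like this: try a list of candidate files in order and use the first one that can be read. The helper name is hypothetical, and the list of candidates is left to the caller because the thread does not say what the replacement for /proc/cpuinfo would be.

```python
from pathlib import Path
from typing import Iterable, Optional, Tuple


def read_first_available(paths: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Return (path, contents) for the first readable file, or None if none work."""
    for candidate in paths:
        try:
            return candidate, Path(candidate).read_text()
        except OSError:
            continue  # missing or unreadable: fall through to the next candidate
    return None
```

A caller would pass `["/proc/cpuinfo", fallback_path]`, where `fallback_path` is whatever location ends up replacing it.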
|
Hypnos Veteran


Joined: 18 Jul 2002 Posts: 1880 Location: Bay Area/Tokyo
|
Posted: Wed Apr 14, 2004 1:43 am Post subject: |
|
|
| Ultraoctane.com wrote: | | It was going to be deprecated in 2.5.4 but it seems to still be around. We could always use a fallback if /proc/cpuinfo is gone on a machine. What is /proc/cpuinfo getting replaced with? |
Something in a device tree in /sys/ |
|
wilburpan l33t


Joined: 21 Jan 2003 Posts: 959
|
Posted: Wed Apr 14, 2004 12:36 pm Post subject: |
|
|
| Hypnos wrote: | | wilburpan wrote: | How long does it take to run the two acovea scripts (which I very imaginatively named acovea1 and acovea2, by the way)? I'm using a 700 MHz P3. |
The first bash script, ~48 hours on my 1.6GHz P4; the second Perl script, a few milliseconds. |
Yikes. I guess I know what my laptop will be doing over the weekend.
One other question: does the acovea script need to be run without any other processes going on at the same time? For example, if I'm running the acovea script from a terminal in KDE, will that give me different results than running it in console mode? _________________ I'm only hanging out in OTW until I get rid of this stupid l33t ranking.....Crap. That didn't work. |
|
Hypnos Veteran


Joined: 18 Jul 2002 Posts: 1880 Location: Bay Area/Tokyo
|
Posted: Wed Apr 14, 2004 1:27 pm Post subject: |
|
|
| wilburpan wrote: | | One other question: does the acovea script need to be run without any other processes going on at the same time? For example, if I'm running the acovea script from a terminal in KDE, will that give me different results than running it in console mode? |
Well, evolutionary fitness is relative, so any additional load should be fine as long as it is consistent or not correlated in time.
Acovea does use real time, not CPU time ... |
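The difference between real (wall-clock) time and CPU time is what makes background load matter at all. A small Python illustration, not taken from Acovea itself, with `measure` as a hypothetical helper:

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent running func()."""
    wall_start = time.perf_counter()   # real elapsed time, includes waiting
    cpu_start = time.process_time()    # CPU time charged to this process only
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu
```

Calling `measure(lambda: time.sleep(0.2))` reports roughly 0.2 s of wall time but almost no CPU time. A benchmark scored on wall time sees time spent waiting, including waiting behind other processes, while a CPU-time score would mostly ignore it.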
|
wilburpan l33t


Joined: 21 Jan 2003 Posts: 959
|
Posted: Fri Apr 16, 2004 12:05 am Post subject: |
|
|
Is there any reason to believe that two different computers with the same CPU would generate different acovea results? |
|
Hypnos Veteran


Joined: 18 Jul 2002 Posts: 1880 Location: Bay Area/Tokyo
|
Posted: Fri Apr 16, 2004 12:16 am Post subject: |
|
|
| wilburpan wrote: | | Is there any reason to believe that two different computers with the same CPU would generate different acovea results? |
Yes -- depends on your motherboard and RAM, too, I would think. |
|
Robe n00b


Joined: 05 Jan 2004 Posts: 64
|
Posted: Fri Apr 16, 2004 6:43 am Post subject: results |
|
|
Just ran the bash and Perl scripts: ~36 hours. (Excellent work on both, BTW.) Here are my results.
| Code: | Score | So? | Switch (annotation)
------------------------------------------------------------------------------
36.2 | Maybe | -fforce-mem (-O2)
30.2 | Yes | -fno-omit-frame-pointer (! -O1)
29.3 | Yes | -fsched-spec (-O2 GCC 3.3)
29.1 | Yes | -minline-all-stringops
28.7 | Yes | -fcaller-saves (-O2)
27.6 | Yes | -mno-align-stringops
27.6 | Maybe | -fno-merge-constants (! -O1)
27.0 | Maybe | -falign-jumps (-O2 GCC 3.3)
26.8 | Yes | -falign-loops (-O2 GCC 3.3)
26.5 | Yes | -freorder-functions (-O2 GCC 3.3)
26.4 | No | -fstrict-aliasing (-O2)
26.2 | Maybe | -falign-labels (-O2 GCC 3.3)
26.0 | Maybe | -ffinite-math-only (fast math)
25.4 | Yes | -fno-delayed-branch (! -O1)
25.2 | Maybe | -fcse-skip-blocks (-O2)
24.5 | No | -funsafe-math-optimizations (fast math)
24.3 | Maybe | -mno-push-args
23.7 | Maybe | -mieee-fp
23.3 | Maybe | -malign-double
22.9 | Maybe | -fmove-all-movables
22.4 | Maybe | -frename-registers (-O3)
22.2 | Maybe | -fno-trapping-math (fast math)
22.1 | Maybe | -fcse-follow-jumps (-O2)
22.0 | Maybe | -fno-cprop-registers (! -O1)
21.9 | Maybe | -fno-if-conversion2 (! -O1)
21.4 | Maybe | -fno-defer-pop (! -O1)
21.3 | Maybe | -fno-thread-jumps (! -O1)
21.0 | No | -frerun-loop-opt (-O2)
20.9 | No | -fno-inline
20.7 | Maybe | -fstrength-reduce (-O2)
20.7 | No | -finline-functions (-O3)
20.6 | Maybe | -freorder-blocks (-O2)
20.6 | Maybe | -fsched-interblock (-O2 GCC 3.3)
19.7 | Maybe | -fno-crossjumping (! -O1)
19.3 | Maybe | -fno-math-errno (fast math)
19.2 | Maybe | -fregmove (-O2)
18.8 | Maybe | -maccumulate-outgoing-args
18.5 | No | -fprefetch-loop-arrays
18.0 | No | -fpeephole2 (-O2)
17.9 | Maybe | -finline-limit
17.7 | Maybe | -fdelete-null-pointer-checks (-O2)
17.6 | No | -fno-if-conversion (! -O1)
17.3 | No | -fgcse (-O2)
17.2 | Maybe | -fno-signaling-nans (fast math)
17.0 | No | -fexpensive-optimizations (-O2)
15.5 | Maybe | -foptimize-sibling-calls (-O2)
15.5 | No | -ftracer
15.2 | Maybe | -frerun-cse-after-loop (-O2)
14.0 | No | -fomit-frame-pointer
13.5 | No | -fschedule-insns2 (-O2)
12.9 | No | -freduce-all-givs
10.8 | No | -fnew-ra
10.7 | No | -fno-guess-branch-probability (! -O1)
9.1 | No | -fschedule-insns (-O2)
8.8 | No | -funroll-loops
7.7 | No | -funroll-all-loops
5.2 | No | -ffloat-store
0.0 | No | -fno-loop-optimize (! -O1)
0.0 | No | -mfpmath=387
0.0 | No | -mfpmath=sse
0.0 | No | -mfpmath=sse,387
0.0 | No | -momit-leaf-frame-pointer |
My question is: do I need to change my CFLAGS from my current -O3 to -O2? In other words, looking at these test results, should my CFLAGS read as follows?
| Code: | | CFLAGS="-O2 -march=pentium4 -fno-omit-frame-pointer -fsched-spec -minline-all-stringops -fcaller-saves -mno-align-stringops -falign-loops -freorder-functions -fno-delayed-branch" |
Is this correct? Also, I noticed that -pipe was not in the list.
Thanks for the input. And again, thanks for making choosing CFLAGS easy! |
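Collecting the "Yes" rows from a report like the one above can be done mechanically. The following Python sketch assumes the `Score | So? | Switch (annotation)` layout shown in the table; the function names are hypothetical and are not part of the bash or Perl scripts from this thread.

```python
def parse_rows(report: str):
    """Yield (score, verdict, switch, annotation) for each data row."""
    for line in report.splitlines():
        parts = [p.strip() for p in line.strip().strip("|").split("|")]
        if len(parts) != 3:
            continue  # separator line or something that is not a table row
        try:
            score = float(parts[0])
        except ValueError:
            continue  # header row
        switch, _, annotation = parts[2].partition(" ")
        yield score, parts[1], switch, annotation.strip("() ")


def yes_switches(report: str) -> list:
    """Return the switches whose verdict column says "Yes", in table order."""
    return [switch for _, verdict, switch, _ in parse_rows(report) if verdict == "Yes"]


def build_cflags(base: str, report: str) -> str:
    """Append every "Yes" switch to a base string such as "-O2 -march=pentium4"."""
    return " ".join([base] + yes_switches(report))
```

This only automates the copying. Whether a given "Yes" switch is safe to use system-wide is a separate question.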
|
lookitsme n00b


Joined: 06 Nov 2003 Posts: 35 Location: Kuala Lumpur, Malaysia
|
Posted: Fri Apr 16, 2004 7:05 am Post subject: Re: results |
|
|
| Robe wrote: | | [...] Is this correct? Also, I noticed that -pipe was not in the list. [...] |
Looks correct to me... I always had -O3 in my CFLAGS. After running acovea and the script from this thread, I changed back to -O2 plus all the "Yes" options from the list. I did some tests on Firefox, Evolution and OpenOffice. All went well and I noticed an increase in speed.
The day before yesterday I messed up my system badly, so I decided to give the new CFLAGS the ultimate test by recompiling everything with them. So far I'm quite happy with the result.
And yes, you need to add -pipe to the list of CFLAGS. FYI, here are mine:
| Code: | | CFLAGS="-Wall -pipe -O2 -mcpu=pentium4 -march=pentium4 -fno-defer-pop -fno-thread-jumps -finline-limit=600 -fno-omit-frame-pointer -mno-align-stringops -freduce-all-givs -fno-if-conversion2 -fPIC" |
|
|
ikaro Veteran


Joined: 14 Jul 2003 Posts: 2525 Location: Denmark
|
Posted: Fri Apr 16, 2004 7:20 am Post subject: |
|
|
All the options in -O2 are in -O3, so why do you change from -O3 to -O2 plus the "Yes" options, instead of keeping -O3 and adding the "Yes" options, the "! -O1" options, and maybe some of the "Maybe" options? _________________ linux: #232767 |
|
lookitsme n00b


Joined: 06 Nov 2003 Posts: 35 Location: Kuala Lumpur, Malaysia
|
Posted: Fri Apr 16, 2004 7:28 am Post subject: |
|
|
| ikaro wrote: | | All the options in -O2 are in -O3, so why do you change from -O3 to -O2 plus the "Yes" options, instead of keeping -O3 and adding the "Yes" options, the "! -O1" options, and maybe some of the "Maybe" options? |
Because most of the "Yes" options on my list switch off some of the -O1 options, and -O3 switches on much more than I have now. Plus, the results are pretty obvious: using -O3, my system is a lot slower and compile times are a lot longer. |
|
wilburpan l33t


Joined: 21 Jan 2003 Posts: 959
|
Posted: Fri Apr 16, 2004 11:17 am Post subject: |
|
|
| Hypnos wrote: | | wilburpan wrote: | | Is there any reason to believe that two different computers with the same CPU would generate different acovea results? |
Yes -- depends on your motherboard and RAM, too, I would think. |
Initially when I read this, I didn't understand the rationale, since it wasn't clear to me why a different motherboard and/or RAM would require different CFLAGS for full optimization. Then I realized that my original question wasn't clear, and that Hypnos might have been saying that the motherboard and RAM would affect the total time it would take to run the acovea scripts.
So just to clarify:
1. Is there any reason to believe that two different computers with the same CPU would generate different acovea results as far as suggested CFLAGS?
2. If so, why would a different motherboard and/or RAM result in different CFLAGS optimizations if the CPU is the same? |
|
Hypnos Veteran


Joined: 18 Jul 2002 Posts: 1880 Location: Bay Area/Tokyo
|
Posted: Fri Apr 16, 2004 5:46 pm Post subject: |
|
|
| wilburpan wrote: |
1. Is there any reason to believe that two different computers with the same CPU would generate different acovea results as far as suggested CFLAGS?
|
Maybe. If you have the exact same CPU (including cache size), your RAM/motherboard speed might be an issue if a run is creating big fat binaries. |
|
Robe n00b


Joined: 05 Jan 2004 Posts: 64
|
Posted: Fri Apr 16, 2004 6:41 pm Post subject: Excellent ! |
|
|
As posted above, I used all the CFLAGS marked "Yes" and added -Wall, -pipe and -fPIC (for prelinking). It looks like this:
| Code: | | CFLAGS="-O2 -march=pentium4 -pipe -Wall -fno-omit-frame-pointer -fsched-spec -minline-all-stringops -fcaller-saves -mno-align-stringops -falign-loops -freorder-functions -fno-delayed-branch -fPIC" |
The amazing part is, I was never able to compile OO 1.1.1 with the usual CFLAGS (-O3 -pentium4 -pipe). But now, not only is my system noticeably faster, I can also compile OO. If I get any more excited, the keys on my keyboard will stick together!
Many thanks to all for writing the excellent bash & perl scripts, as well as answering all posted questions. GENTOO ROCKS !!! |
|
Hypnos Veteran


Joined: 18 Jul 2002 Posts: 1880 Location: Bay Area/Tokyo
|
Posted: Fri Apr 16, 2004 10:20 pm Post subject: Re: Excellent ! |
|
|
| Robe wrote: | | As posted above, I used all the CFLAGS marked "Yes" and added -Wall, -pipe and -fPIC (for prelinking). It looks like this: |
Don't use -fPIC for everything -- applications/libs that need it will append the flag. |
|
Robe n00b


Joined: 05 Jan 2004 Posts: 64
|
Posted: Fri Apr 16, 2004 11:01 pm Post subject: Re: Excellent ! |
|
|
| Hypnos wrote: | | Robe wrote: | | As posted above, I used all the CFLAGS marked "Yes" and added -Wall, -pipe and -fPIC (for prelinking). It looks like this: |
Don't use -fPIC for everything -- applications/libs that need it will append the flag. |
Thanks, Hypnos. I was wondering about that switch. |
|
mollmerx n00b

Joined: 19 Dec 2003 Posts: 41 Location: Cambridge, UK
|
Posted: Sat Apr 17, 2004 4:20 am Post subject: |
|
|
Earlier on, someone asked whether it was a good idea to run acovea from a terminal window in KDE. I would have thought that this would even be preferable to running it entirely on its own. After all, the programs that you are trying to find the optimal CFLAGS for will be running alongside XFree, KDE and all the rest later on. So you want to optimise for that kind of behaviour. And, as was also mentioned, genetic fitness is relative.
I wouldn't even worry too much about completing the odd task or other while running acovea. If something slows it down a little somewhere, it will get on track again somewhere else later on. The CFLAGS you end up with will be the same ones; the route acovea took to get there doesn't matter at all.
The thing I love about GA is that it is so parallel to real evolution. More or less everything from Richard Dawkins' "The Blind Watchmaker" can be applied here.
Have fun optimising! I haven't got round to giving it a go yet - final exams coming up in just over a week. After that I think I'll be gentooing day and night for a few months. |
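The "fitness is relative" argument can be made concrete with a toy example. It assumes that selection depends only on the ordering of run times, which is an assumption made for illustration and not a description of Acovea's internals; the names and numbers are made up.

```python
def rank_by_runtime(timings: dict) -> list:
    """Return candidate names ordered from fastest to slowest."""
    return sorted(timings, key=timings.get)


# Hypothetical run times in seconds for three candidate flag sets.
idle = {"flags_a": 10.0, "flags_b": 8.5, "flags_c": 12.0}

# The same runs under a steady background load that slows everything by 30%.
loaded = {name: seconds * 1.3 for name, seconds in idle.items()}

same_order = rank_by_runtime(idle) == rank_by_runtime(loaded)
```

A constant slowdown leaves the ordering untouched, so `same_order` is true. A burst of load that hits only some candidates could reorder them, which is why the load should be steady or at least uncorrelated with the runs.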
|
wilburpan l33t


Joined: 21 Jan 2003 Posts: 959
|
Posted: Sat Apr 17, 2004 5:55 am Post subject: |
|
|
| Hypnos wrote: | | wilburpan wrote: |
1. Is there any reason to believe that two different computers with the same CPU would generate different acovea results as far as suggested CFLAGS?
|
Maybe. If you have the exact same CPU (including cache size), your RAM/motherboard speed might be an issue if a run is creating big fat binaries. |
Thanks for clearing that up.
Here are the results from my 750 MHz P3 desktop:
| Code: | Score | So? | Switch (annotation)
------------------------------------------------------------------------------
38.0 | Yes | -fgcse (-O2)
29.6 | Yes | -maccumulate-outgoing-args
29.5 | Yes | -finline-limit
29.5 | Yes | -mieee-fp
29.3 | Yes | -frename-registers (-O3)
28.8 | Yes | -falign-jumps (-O2 GCC 3.3)
28.7 | Yes | -fcse-skip-blocks (-O2)
28.5 | Yes | -fmove-all-movables
28.3 | Maybe | -freorder-blocks (-O2)
28.1 | Maybe | -fschedule-insns2 (-O2)
27.8 | Maybe | -finline-functions (-O3)
27.6 | Yes | -fno-merge-constants (! -O1)
27.4 | Maybe | -fno-math-errno (fast math)
27.2 | Maybe | -fno-if-conversion2 (! -O1)
27.2 | Yes | -fcse-follow-jumps (-O2)
27.2 | Yes | -fno-crossjumping (! -O1)
26.9 | Yes | -freorder-functions (-O2 GCC 3.3)
26.6 | Maybe | -falign-loops (-O2 GCC 3.3)
26.4 | Maybe | -frerun-cse-after-loop (-O2)
26.4 | Maybe | -fno-cprop-registers (! -O1)
26.2 | Yes | -fexpensive-optimizations (-O2)
26.2 | Yes | -mno-align-stringops
26.0 | Yes | -mno-push-args
25.9 | Yes | -minline-all-stringops
25.7 | Maybe | -fno-omit-frame-pointer (! -O1)
25.6 | Maybe | -funsafe-math-optimizations (fast math)
25.4 | No | -fstrict-aliasing (-O2)
25.1 | Yes | -malign-double
24.7 | Maybe | -frerun-loop-opt (-O2)
24.0 | Yes | -fdelete-null-pointer-checks (-O2)
24.0 | Yes | -ffinite-math-only (fast math)
24.0 | Maybe | -fcaller-saves (-O2)
23.6 | Yes | -fno-delayed-branch (! -O1)
23.3 | Maybe | -fno-defer-pop (! -O1)
23.3 | Maybe | -fsched-spec (-O2 GCC 3.3)
22.9 | Yes | -fsched-interblock (-O2 GCC 3.3)
22.8 | Maybe | -fno-trapping-math (fast math)
22.7 | Yes | -fstrength-reduce (-O2)
22.1 | Maybe | -fpeephole2 (-O2)
22.0 | Maybe | -falign-labels (-O2 GCC 3.3)
21.8 | Yes | -foptimize-sibling-calls (-O2)
20.8 | Maybe | -fno-thread-jumps (! -O1)
20.7 | Maybe | -ftracer
17.3 | No | -fno-signaling-nans (fast math)
17.1 | No | -fforce-mem (-O2)
16.8 | Maybe | -fprefetch-loop-arrays
15.2 | No | -funroll-all-loops
14.5 | No | -fregmove (-O2)
13.9 | No | -fno-if-conversion (! -O1)
13.3 | No | -freduce-all-givs
12.7 | No | -fno-guess-branch-probability (! -O1)
10.3 | No | -ffloat-store
10.2 | No | -fschedule-insns (-O2)
10.1 | No | -funroll-loops
10.0 | No | -fno-inline
9.6 | No | -fomit-frame-pointer
9.2 | No | -fnew-ra
0.0 | No | -fno-loop-optimize (! -O1)
0.0 | No | -mfpmath=387
0.0 | No | -mfpmath=sse
0.0 | No | -mfpmath=sse,387
0.0 | No | -momit-leaf-frame-pointer |
So if I understand this correctly, I should:
1. use -O2 since that seems to encompass most of the "Yes" options above,
2. add in the other "Yes" options, -pipe, and -Wall
3. not use the fast-math optimizations.
So this should be what I put in /etc/make.conf:
| Code: | | CFLAGS="-march=pentium3 -O2 -maccumulate-outgoing-args -finline-limit -mieee-fp -frename-registers -fmove-all-movables -fno-merge-constants -fno-crossjumping -mno-align-stringops -mno-push-args -minline-all-stringops -malign-double -fno-delayed-branch -Wall -pipe" |
Correct? |
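The three selection rules in this post can be written down as a filter over (verdict, switch, annotation) rows. This is a sketch of that reading of the rules, with a hypothetical function name; it treats any annotation that starts with `-O2` (including `-O2 GCC 3.3`) as already implied by `-O2`, which is an assumption about what the script's annotations mean.

```python
def pick_extra_switches(rows, base_level: str = "-O2") -> list:
    """Keep "Yes" switches that the base level does not already imply.

    rows is an iterable of (verdict, switch, annotation) tuples, where
    annotation is the text in parentheses from the report ("" if none).
    """
    chosen = []
    for verdict, switch, annotation in rows:
        if verdict != "Yes":
            continue  # rules 1 and 2: only "Yes" rows are candidates
        if annotation.startswith(base_level):
            continue  # rule 1: assumed to be switched on by the base level
        if annotation == "fast math":
            continue  # rule 3: leave out the fast-math group
        chosen.append(switch)
    return chosen
```

Applied to the table above, this filter yields the same twelve extra switches as the CFLAGS line in the post; `-march`, `-pipe` and `-Wall` still have to be added by hand.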
|
Hypnos Veteran


Joined: 18 Jul 2002 Posts: 1880 Location: Bay Area/Tokyo
|
Posted: Sat Apr 17, 2004 6:48 am Post subject: |
|
|
| wilburpan wrote: | So if I understand this correctly, I should:
1. use -O2 since that seems to encompass most of the "Yes" options above,
2. add in the other "Yes" options, -pipe, and -Wall
3. not use the fast-math optimizations.
So this should be what I put in /etc/make.conf:
| Code: | | CFLAGS="-march=pentium3 -O2 -maccumulate-outgoing-args -finline-limit -mieee-fp -frename-registers -fmove-all-movables -fno-merge-constants -fno-crossjumping -mno-align-stringops -mno-push-args -minline-all-stringops -malign-double -fno-delayed-branch -Wall -pipe" |
Correct? |
I would recommend against "-mieee-fp" since some programs are tailored to "wrong" math.
I would strongly recommend against "-malign-double" since it breaks binary compatibility with binaries built with the default word alignment. |
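Dropping the two discouraged switches from a flag string is simple to do by hand, but a small helper keeps it repeatable if the list is regenerated. The function name is hypothetical, and the list of unwanted flags is only the two named in this reply.

```python
def drop_flags(cflags: str, unwanted) -> str:
    """Return cflags with every flag listed in unwanted removed."""
    unwanted = set(unwanted)
    return " ".join(flag for flag in cflags.split() if flag not in unwanted)


# The two switches advised against in this reply.
DISCOURAGED = ("-mieee-fp", "-malign-double")
```

For example, `drop_flags("-march=pentium3 -O2 -mieee-fp -malign-double -pipe", DISCOURAGED)` gives `"-march=pentium3 -O2 -pipe"`.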
|
Robe n00b


Joined: 05 Jan 2004 Posts: 64
|
Posted: Sat Apr 17, 2004 7:06 am Post subject: |
|
|
| Hypnos wrote: | | wilburpan wrote: | So if I understand this correctly, I should:
1. use -O2 since that seems to encompass most of the "Yes" options above,
2. add in the other "Yes" options, -pipe, and -Wall
3. not use the fast-math optimizations.
So this should be what I put in /etc/make.conf:
| Code: | | CFLAGS="-march=pentium3 -O2 -maccumulate-outgoing-args -finline-limit -mieee-fp -frename-registers -fmove-all-movables -fno-merge-constants -fno-crossjumping -mno-align-stringops -mno-push-args -minline-all-stringops -malign-double -fno-delayed-branch -Wall -pipe" |
Correct? |
I would recommend against "-mieee-fp" since some programs are tailored to "wrong" math.
I would strongly recommend against "-malign-double" since it breaks binary compatibility with binaries built with the default word alignment. |
Agreed. -mieee-fp breaks Nvidia kernel drivers. |
|
wilburpan l33t


Joined: 21 Jan 2003 Posts: 959
|
Posted: Sat Apr 17, 2004 7:11 am Post subject: |
|
|
Thanks for the input.
Three more questions:
1. Is there a list of options that can be generated by acovea that should not be used because of compatibility issues, etc.?
2. How do you pick a good number for -finline-limit?
3. With my results, -finline-functions is a "Maybe", so I didn't include it in my list. But if I'm not using -finline-functions, is -finline-limit of any use? |
|
karthik1024 n00b


Joined: 11 Mar 2004 Posts: 10 Location: Stanford, CA
|
Posted: Sat Apr 17, 2004 11:58 am Post subject: |
|
|
| ikaro wrote: | almabench results for Amd Athlon-XP:
| Code: |
optimistic options:
-fno-cprop-registers (1.135)
-fcse-skip-blocks (1.181)
-fschedule-insns (2.324)
-freorder-blocks (1.455)
-finline-functions (1.821)
-frename-registers (1.272)
-fprefetch-loop-arrays (1.363)
-mieee-fp (1.683)
-funsafe-math-optimizations (2.461)
pessimistic options:
-fno-loop-optimize (-1.288)
-frerun-loop-opt (-1.197)
-ffloat-store (-1.928)
-fno-inline (-1.791)
-fnew-ra (-2.065)
-funroll-all-loops (-1.242)
-mfpmath=387 (-1.425)
-mfpmath=sse (-1.563)
-mfpmath=sse,387 (-1.7)
-fomit-frame-pointer (-1.334)
-fno-math-errno (-1.06)
|
Still running... it's going to take all night.
 |
Does this mean that I can use the "optimistic" flags and drop the "pessimistic" CFLAGS to get good performance?
Karthik
AMD Athlon-xp 1700+ |
|
Hypnos Veteran


Joined: 18 Jul 2002 Posts: 1880 Location: Bay Area/Tokyo
|
Posted: Sat Apr 17, 2004 2:29 pm Post subject: |
|
|
| wilburpan wrote: | | 1. Is there a list of options that can be generated by acovea that should not be used because of compatibility issues, etc.? |
No; I was too lazy to include those annotations in my aggregate reporting script.
| Quote: | | 2. How do you pick a good number for -finline-limit? |
Acovea seems to like the default, 600.
| Quote: | | 3. With my results, -finline-functions is a "Maybe", so I didn't include it in my list. But if I'm not using -finline-functions, is -finline-limit of any use? |
Hmmm, I don't know ... probably not. |
|
|