Gentoo Forums
Acovea-4.0.0 : Try out my ebuilds (and scripts)

Angel666
n00b

Joined: 14 Nov 2003
Posts: 45
Location: Palo Alto, CA

Posted: Sun Apr 11, 2004 6:25 am

First of all, I would like to say thanks to Hypnos for actually writing this brilliant little script!

Now, my results:

Code:
 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  35.0 |  Yes  | -freorder-blocks (-O2)
  31.8 |  Yes  | -falign-loops (-O2 GCC 3.3)
  31.5 |  Yes  | -fdelete-null-pointer-checks (-O2)
  31.3 |  Yes  | -fno-if-conversion2 (! -O1)
  31.3 | Maybe | -finline-functions (-O3)
  30.4 |  Yes  | -malign-double
  29.5 |  Yes  | -funit-at-a-time
  29.5 |  Yes  | -fno-trapping-math (fast math)
  29.2 |  Yes  | -fpeephole2 (-O2)
  29.1 | Maybe | -fexpensive-optimizations (-O2)
  28.6 |  Yes  | -ffinite-math-only (fast math)
  28.4 | Maybe | -fpeel-loops
  28.2 | Maybe | -fforce-mem (-O2)
  28.0 |  Yes  | -mno-push-args
  27.7 | Maybe | -fsched-interblock (-O2 GCC 3.3)
  27.6 |  Yes  | -fregmove (-O2)
  27.6 | Maybe | -fgcse (-O2)
  27.5 | Maybe | -funsafe-math-optimizations (fast math)
  27.2 |  Yes  | -maccumulate-outgoing-args
  27.2 |  Yes  | -falign-functions
  27.1 |  Yes  | -finline-limit
  26.8 |  Yes  | -fno-crossjumping (! -O1)
  26.7 |  Yes  | -fsched-spec (-O2 GCC 3.3)
  26.5 |  Yes  | -falign-jumps (-O2 GCC 3.3)
  26.5 | Maybe | -fstrict-aliasing (-O2)
  26.4 |  Yes  | -fcse-skip-blocks (-O2)
  26.0 |  Yes  | -fno-signaling-nans (fast math)
  25.8 |  Yes  | -fno-omit-frame-pointer (! -O1)
  25.5 |  Yes  | -minline-all-stringops
  25.4 |  Yes  | -ftracer
  25.1 |  Yes  | -funswitch-loops
  24.9 | Maybe | -fweb
  24.7 | Maybe | -frerun-loop-opt (-O2)
  24.6 | Maybe | -fcaller-saves (-O2)
  24.5 | Maybe | -falign-labels (-O2 GCC 3.3)
  24.4 | Maybe | -frerun-cse-after-loop (-O2)
  24.4 |  Yes  | -fmove-all-movables
  24.2 | Maybe | -fcse-follow-jumps (-O2)
  24.2 | Maybe | -mieee-fp
  23.9 |  Yes  | -fno-defer-pop (! -O1)
  23.8 |  Yes  | -frename-registers (-O3)
  23.5 | Maybe | -fno-thread-jumps (! -O1)
  23.3 | Maybe | -fno-cprop-registers (! -O1)
  22.9 |  Yes  | -freorder-functions (-O2 GCC 3.3)
  22.7 |  Yes  | -fno-delayed-branch (! -O1)
  22.1 |  Yes  | -freduce-all-givs
  21.8 | Maybe | -fstrength-reduce (-O2)
  21.0 | Maybe | -fno-math-errno (fast math)
  20.3 |   No  | -fschedule-insns2 (-O2)
  19.5 | Maybe | -mno-align-stringops
  18.6 | Maybe | -foptimize-sibling-calls (-O2)
  18.1 | Maybe | -fno-merge-constants (! -O1)
  17.3 |   No  | -fno-if-conversion (! -O1)
  16.9 | Maybe | -fbranch-target-load-optimize
  15.8 |   No  | -fprefetch-loop-arrays
  14.9 |   No  | -fschedule-insns (-O2)
  11.1 |   No  | -fnew-ra
  10.8 |   No  | -fno-loop-optimize (! -O1)
   9.7 |   No  | -ffloat-store
   9.4 |   No  | -fomit-frame-pointer
   8.5 |   No  | -funroll-loops
   8.0 |   No  | -momit-leaf-frame-pointer
   6.7 |   No  | -fno-guess-branch-probability (! -O1)
   6.6 |   No  | -funroll-all-loops
   6.1 |   No  | -fno-inline
   0.0 |   No  | -fbranch-target-load-optimize2
   0.0 |   No  | -mfpmath=387
   0.0 |   No  | -mfpmath=sse
   0.0 |   No  | -mfpmath=sse,387


And my new CFLAGS (chosen with -O2 already used and only "Yes" options):

Code:
CFLAGS="-O2 -Wall -pipe -march=pentium4 -fno-if-conversion2 -finline-functions -funit-at-a-time -fpeel-loops -mno-push-args -maccumulate-outgoing-args -falign-functions -finline-limit=600 -fno-crossjumping -fno-omit-frame-pointer -minline-all-stringops -ftracer -funswitch-loops -fmove-all-movables -fno-defer-pop -frename-registers -fno-delayed-branch -freduce-all-givs"
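
The selection step described above (take the "Yes" rows, skip what -O2 already turns on) can be sketched roughly as follows. This is only an illustration, assuming the digest output has exactly the three-column layout shown in this thread; the names `parse_digest` and `pick_flags`, and the rule of skipping rows whose annotation starts with the base level, are assumptions and not part of Hypnos's script.

```python
import re

# Matches digest rows such as "  35.0 |  Yes  | -freorder-blocks (-O2)".
# The optional trailing group captures the annotation in parentheses.
ROW = re.compile(
    r"^\s*([\d.]+)\s*\|\s*(Yes|Maybe|No)\s*\|\s*(\S+)(?:\s+\((.*)\))?\s*$"
)

def parse_digest(text):
    """Return (score, verdict, switch, annotation) tuples from digest output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((float(m.group(1)), m.group(2), m.group(3), m.group(4) or ""))
    return rows

def pick_flags(text, base="-O2"):
    """Keep "Yes" switches that the base level does not already enable.

    Assumption: an annotation starting with the base level, e.g. "(-O2)" or
    "(-O2 GCC 3.3)", means the switch is implied by that level.
    """
    return [
        switch
        for _, verdict, switch, note in parse_digest(text)
        if verdict == "Yes" and not note.startswith(base)
    ]

SAMPLE = """\
 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  35.0 |  Yes  | -freorder-blocks (-O2)
  31.3 | Maybe | -finline-functions (-O3)
  30.4 |  Yes  | -malign-double
  25.4 |  Yes  | -ftracer
  20.3 |   No  | -fschedule-insns2 (-O2)
"""

print(" ".join(pick_flags(SAMPLE)))
```

Switches that take a value, such as -finline-limit, still need the value filled in by hand.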


I have to say, that is much better than it was before! Just a few notes: the link that you gave me for optimizations is for GCC 3.3, so it does not contain options like -fnew-ra, which are exactly the ones that I am interested in. If any documentation is available, it would be helpful.

Maybe I have a different ideal for this than anyone else, but here is my idea (I wish I knew Perl/Python :wink: ). Some options are very good for all benchmarks, or are at least positive for all of them. And some flags are good and bad for different operations. In my opinion, these should somehow be separated, for example with additional labels like "Stable" and "Unstable", so the user would know whether a particular flag was good across the board or very specific. Maybe it would be redundant, but it would certainly make things simpler (in my opinion) for someone who just wants to choose optimistic options without compromising any programs, or for the person who wants full optimization power at the expense of a few slow apps.
_________________
"One World, One web, One program" - Microsoft Promo ad.
"Ein Volk, Ein Reich, Ein Fuhrer" - Adolf Hitler

aethyr
Veteran

Joined: 06 Apr 2003
Posts: 1085
Location: NYC

Posted: Sun Apr 11, 2004 7:07 am

Angel666 wrote:
the link that you gave me for optimizations is for GCC 3.3, so it does not contain options like -fnew-ra, which are exactly the ones that I am interested in. If any documentation is available, it would be helpful.

I don't know if you're talking about the link I provided:
http://gcc.gnu.org/onlinedocs/gcc-3.3/gcc/Optimize-Options.html
http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
AFAIK -fnew-ra is a 3.3 flag. As found on those provided links:
Code:
-fnew-ra
    Use a graph coloring register allocator. Currently this option is meant for testing, so we are interested to hear about miscompilations with -fnew-ra.
-ftracer
    Perform tail duplication to enlarge superblock size. This transformation simplifies the control flow of the function allowing other optimizations to do better job.

Angel666
n00b

Joined: 14 Nov 2003
Posts: 45
Location: Palo Alto, CA

Posted: Sun Apr 11, 2004 7:28 am

Ooops... sorry. I don't know how I missed that, I swear I searched both those pages for those flags... :(

Hypnos
Advocate

Joined: 18 Jul 2002
Posts: 2889
Location: Omnipresent

Posted: Sun Apr 11, 2004 8:08 am

Angel666 wrote:
Maybe I have a different ideal for this than anyone else, but here is my idea (I wish I knew Perl/Python :wink: ). Some options are very good for all benchmarks, or are at least positive for all of them.

Yes, they have high scores and are recommended based on statistical confidence.

Quote:
And some flags are good and bad for different operations.


The opposite: those have low scores or are too volatile to recommend.

Quote:
In my opinion, these should somehow be separated, for example with additional labels like "Stable" and "Unstable", so the user would know whether a particular flag was good across the board or very specific. Maybe it would be redundant, but it would certainly make things simpler (in my opinion) for someone who just wants to choose optimistic options without compromising any programs, or for the person who wants full optimization power at the expense of a few slow apps.

"Score" tells you how well you make out in the end after compiling and running a gazillion apps. "So?" is a recommendation of whether or not the flag is worth it, given the volatility in its positive performance.

I think my script does what you describe, but perhaps I misunderstand ...
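
To make the "Score" versus "So?" distinction concrete, here is a minimal sketch of one way such a rating could work. It is not the actual digest script: the function name `rate_switch`, the sample numbers and both thresholds are invented for illustration. The only idea taken from the post is that the score is an average and the recommendation depends on how volatile the individual results are.

```python
from statistics import mean, stdev

def rate_switch(gains, yes_ratio=2.0, maybe_ratio=0.5):
    """Return (score, verdict) for one switch.

    gains: hypothetical per-benchmark scores for the switch.
    yes_ratio, maybe_ratio: invented thresholds on mean / standard deviation.
    """
    score = mean(gains)
    spread = stdev(gains)  # sample standard deviation, needs at least 2 values
    if score <= 0:
        return score, "No"
    if spread == 0 or score / spread >= yes_ratio:
        return score, "Yes"
    if score / spread >= maybe_ratio:
        return score, "Maybe"
    return score, "No"

# Made-up data: one steady switch, one volatile switch, one harmful switch.
examples = {
    "steady": [30, 32, 29, 31, 30, 33, 31],
    "volatile": [60, -5, 45, 2, 50, -10, 40],
    "harmful": [5, -20, 10, -15, 8, -12, 3],
}

for name, gains in examples.items():
    score, verdict = rate_switch(gains)
    print(f"{score:5.1f} | {verdict:^5} | {name}")
```

Under these assumptions a "Stable" flag is simply one whose mean is large compared with its spread, which is the separation Angel666 asks for.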
_________________
Personal overlay | Simple backup scheme

ikaro
Advocate

Joined: 14 Jul 2003
Posts: 2527
Location: Denmark

Posted: Sun Apr 11, 2004 8:11 am

Code:

 obsolete



Should I buy another CPU?
_________________
linux: #232767


Last edited by ikaro on Fri May 14, 2004 9:05 pm; edited 1 time in total

Hypnos
Advocate

Joined: 18 Jul 2002
Posts: 2889
Location: Omnipresent

Posted: Sun Apr 11, 2004 8:15 am

ikaro wrote:
Should I buy another CPU?

Goodness, no! :) All this says is that gcc isn't quite sure what to do with your CPU, but that doesn't mean it's slow, even in comparison to optimized code running on an Intel CPU.

I think the only combination that would win hands down is icc+Intel.

----

Perhaps it's placebo, but after recompiling Mozilla and X.org-X11 with my custom CFLAGS forced, my desktop is a bit snappier. We'll see how stable it runs ...

ikaro
Advocate

Joined: 14 Jul 2003
Posts: 2527
Location: Denmark

Posted: Sun Apr 11, 2004 8:53 am

Edited after recommendation:

Code:

CFLAGS="
-march=athlon-xp
-O3
-pipe
-fno-cprop-registers
-fno-thread-jumps
-fno-defer-pop
-maccumulate-outgoing-args
-fno-if-conversion2
-fno-delayed-branch
-fno-crossjumping
-fno-merge-constants
-fno-omit-frame-pointer
-ftracer
-finline-limit=600
-minline-all-stringops
-mno-push-args
-fmove-all-movables
-mno-align-stringops"


Last edited by ikaro on Sun Apr 11, 2004 12:46 pm; edited 3 times in total

Hypnos
Advocate

Joined: 18 Jul 2002
Posts: 2889
Location: Omnipresent

Posted: Sun Apr 11, 2004 9:07 am

ikaro wrote:
So what does this look like?

Code:

CFLAGS="
-march=athlon-xp
-O3
-pipe
-maccumulate-outgoing-args
-ftracer
-finline-limit=600
-fno-cprop-registers
-fno-crossjumping
-fno-thread-jumps
-fno-defer-pop 
-fno-signaling-nans
-fno-if-conversion2
-fno-omit-frame-pointer
-minline-all-stringops
-mno-push-args
-fmove-all-movables
-malign-double"


Thanks.


It generally looks fine; but:

* I don't understand your logic for choosing the "maybe" switches, but it probably doesn't matter much (that's why they're "maybe"'s) ...

* I recommend against "-malign-double" (which is considered unsafe) and "-fno-signaling-nans" which can confuse programs which rely on standards-compliant floating point math.
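
The second recommendation can be applied mechanically. Below is a small sketch that strips the two switches named above from a CFLAGS string; the `RISKY` table only restates the reasons given in this post, and the helper name `drop_risky` is made up for illustration.

```python
# Switches advised against in this thread, with the stated reason.
RISKY = {
    "-malign-double": "considered unsafe",
    "-fno-signaling-nans": "can confuse programs which rely on "
                           "standards-compliant floating point math",
}

def drop_risky(cflags):
    """Split a CFLAGS string into (kept_string, dropped_list)."""
    kept, dropped = [], []
    for flag in cflags.split():
        (dropped if flag in RISKY else kept).append(flag)
    return " ".join(kept), dropped

# ikaro's list as quoted above.
cflags = """-march=athlon-xp -O3 -pipe -maccumulate-outgoing-args -ftracer
-finline-limit=600 -fno-cprop-registers -fno-crossjumping -fno-thread-jumps
-fno-defer-pop -fno-signaling-nans -fno-if-conversion2 -fno-omit-frame-pointer
-minline-all-stringops -mno-push-args -fmove-all-movables -malign-double"""

kept, dropped = drop_risky(cflags)
print('CFLAGS="' + kept + '"')
for flag in dropped:
    print("dropped", flag + ":", RISKY[flag])
```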

ikaro
Advocate

Joined: 14 Jul 2003
Posts: 2527
Location: Denmark

Posted: Sun Apr 11, 2004 9:38 am

I will take your advice and update the CFLAGS. I'm only posting the results and letting people decide what's best.
Thanks again.

Last edited by ikaro on Wed Apr 14, 2004 4:22 pm; edited 1 time in total

Hypnos
Advocate

Joined: 18 Jul 2002
Posts: 2889
Location: Omnipresent

Posted: Sun Apr 11, 2004 10:10 am

I am pleased -- my speed boost is not placebo: Q3A is playable now at max settings, and UT2004 is more responsive (texture loading is still an obvious problem with my OSS Radeon driver).

Overall, it's comparable to the boost you get moving from 2.4->2.6 (I'm stuck on 2.4 since suspend-to-disk doesn't work for me on 2.6).

lookitsme
n00b

Joined: 06 Nov 2003
Posts: 48
Location: Kuala Lumpur, Malaysia

Posted: Mon Apr 12, 2004 12:06 pm

I'm a bit confused here. I'm running the latest version of the run script at the moment. It has finished the first 4 tests already, and if I run the digest Perl script it gives me all No's:

Code:

 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  14.6 |   No  | -fno-thread-jumps (! -O1)
  14.3 |   No  | -fschedule-insns2 (-O2)
  13.6 |   No  | -frerun-loop-opt (-O2)
  12.6 |   No  | -finline-limit
  12.5 |   No  | -minline-all-stringops
  12.5 |   No  | -funsafe-math-optimizations (fast math)
  12.2 |   No  | -fsched-spec (-O2 GCC 3.3)
  12.0 |   No  | -fdelete-null-pointer-checks (-O2)
  11.9 |   No  | -fcaller-saves (-O2)
  11.7 |   No  | -mieee-fp
  11.6 |   No  | -falign-labels (-O2 GCC 3.3)
  11.5 |   No  | -fno-omit-frame-pointer (! -O1)
  11.3 |   No  | -falign-jumps (-O2 GCC 3.3)
  11.1 |   No  | -fpeephole2 (-O2)
  11.1 |   No  | -mno-align-stringops
  10.9 |   No  | -falign-loops (-O2 GCC 3.3)
  10.7 |   No  | -fno-defer-pop (! -O1)
  10.6 |   No  | -frename-registers (-O3)
  10.5 |   No  | -mno-push-args
  10.3 |   No  | -fno-if-conversion2 (! -O1)
  10.2 |   No  | -frerun-cse-after-loop (-O2)
  10.2 |   No  | -fforce-mem (-O2)
  10.0 |   No  | -fexpensive-optimizations (-O2)
   9.9 |   No  | -fno-delayed-branch (! -O1)
   9.7 |   No  | -freorder-functions (-O2 GCC 3.3)
   9.6 |   No  | -fcse-skip-blocks (-O2)
   9.5 |   No  | -malign-double
   9.3 |   No  | -fno-merge-constants (! -O1)
   9.1 |   No  | -fno-signaling-nans (fast math)
   9.1 |   No  | -fno-loop-optimize (! -O1)
   9.1 |   No  | -fsched-interblock (-O2 GCC 3.3)
   8.9 |   No  | -freduce-all-givs
   8.9 |   No  | -fmove-all-movables
   8.7 |   No  | -foptimize-sibling-calls (-O2)
   8.5 |   No  | -fcse-follow-jumps (-O2)
   8.3 |   No  | -fgcse (-O2)
   8.3 |   No  | -finline-functions (-O3)
   8.1 |   No  | -fstrength-reduce (-O2)
   7.5 |   No  | -ffinite-math-only (fast math)
   7.5 |   No  | -freorder-blocks (-O2)
   7.2 |   No  | -ftracer
   7.2 |   No  | -fprefetch-loop-arrays
   6.9 |   No  | -fno-crossjumping (! -O1)
   6.1 |   No  | -fno-math-errno (fast math)
   6.0 |   No  | -fregmove (-O2)
   5.4 |   No  | -fstrict-aliasing (-O2)
   5.4 |   No  | -fno-cprop-registers (! -O1)
   5.1 |   No  | -fno-if-conversion (! -O1)
   5.0 |   No  | -fno-inline
   4.6 |   No  | -fschedule-insns (-O2)
   0.0 |   No  | -fno-guess-branch-probability (! -O1)
   0.0 |   No  | -ffloat-store
   0.0 |   No  | -fnew-ra
   0.0 |   No  | -funroll-loops
   0.0 |   No  | -funroll-all-loops
   0.0 |   No  | -maccumulate-outgoing-args
   0.0 |   No  | -mfpmath=387
   0.0 |   No  | -mfpmath=sse
   0.0 |   No  | -mfpmath=sse,387
   0.0 |   No  | -fomit-frame-pointer
   0.0 |   No  | -momit-leaf-frame-pointer
   0.0 |   No  | -fno-trapping-math (fast math)


Am I just impatient and should I wait for all 7 tests to finish???

Hypnos
Advocate

Joined: 18 Jul 2002
Posts: 2889
Location: Omnipresent

Posted: Mon Apr 12, 2004 1:23 pm

lookitsme wrote:
Am I just impatient and should I wait for all 7 tests to finish???

Yes. What's happening in all likelihood is that the standard deviation is too big for the means you're getting.
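
A rough way to see why waiting helps: with only a few tests, the uncertainty on the mean is large, so no switch clears the bar. The sketch below uses the standard error of the mean as the yardstick, which is an assumption on my part and may not be what the digest script does; the sample scores and the factor `k` are made up.

```python
from math import sqrt
from statistics import mean, stdev

def is_confident(samples, k=2.0):
    """True if the mean stays positive after subtracting k standard errors."""
    n = len(samples)
    if n < 2:
        return False  # a standard deviation needs at least two samples
    std_err = stdev(samples) / sqrt(n)
    return mean(samples) - k * std_err > 0

# Made-up scores for one switch over the 7 tests.
runs = [12, 40, -5, 25, 30, 28, 33]

print("after 4 tests:", is_confident(runs[:4]))
print("after 7 tests:", is_confident(runs))
```

With these numbers the switch fails the check after 4 tests and passes after all 7, because the standard error shrinks as more tests come in.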

aethyr
Veteran

Joined: 06 Apr 2003
Posts: 1085
Location: NYC

Posted: Mon Apr 12, 2004 3:17 pm

Results! 850 MHz Celeron2 (Coppermine), gcc 3.3:
Code:

 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  31.0 | Maybe | -fgcse (-O2)
  30.1 |  Yes  | -frerun-cse-after-loop (-O2)
  29.9 | Maybe | -freorder-functions (-O2 GCC 3.3)
  29.8 |  Yes  | -finline-functions (-O3)
  29.8 |  Yes  | -mno-align-stringops
  29.4 |  Yes  | -fmove-all-movables
  29.1 | Maybe | -fforce-mem (-O2)
  28.4 | Maybe | -mieee-fp
  28.3 |  Yes  | -fno-defer-pop (! -O1)
  28.0 |  Yes  | -maccumulate-outgoing-args
  28.0 |  Yes  | -fno-cprop-registers (! -O1)
  27.8 |  Yes  | -falign-labels (-O2 GCC 3.3)
  27.5 |  Yes  | -finline-limit
  27.1 | Maybe | -fschedule-insns2 (-O2)
  26.9 |  Yes  | -fcse-follow-jumps (-O2)
  26.9 |  Yes  | -fsched-interblock (-O2 GCC 3.3)
  26.6 |  Yes  | -foptimize-sibling-calls (-O2)
  26.6 |  Yes  | -fno-math-errno (fast math)
  26.4 |  Yes  | -fno-merge-constants (! -O1)
  26.3 | Maybe | -fno-trapping-math (fast math)
  26.1 |  Yes  | -fcaller-saves (-O2)
  25.9 |  Yes  | -fsched-spec (-O2 GCC 3.3)
  25.8 |  Yes  | -fprefetch-loop-arrays
  25.7 |  Yes  | -fdelete-null-pointer-checks (-O2)
  25.3 |  Yes  | -ffinite-math-only (fast math)
  24.8 |  Yes  | -fno-thread-jumps (! -O1)
  24.7 |  Yes  | -fno-crossjumping (! -O1)
  24.7 |  Yes  | -fno-signaling-nans (fast math)
  24.5 |   No  | -fstrict-aliasing (-O2)
  24.3 | Maybe | -fno-if-conversion2 (! -O1)
  24.1 | Maybe | -minline-all-stringops
  23.7 | Maybe | -frerun-loop-opt (-O2)
  23.6 |  Yes  | -fcse-skip-blocks (-O2)
  23.3 | Maybe | -ftracer
  23.2 | Maybe | -funsafe-math-optimizations (fast math)
  23.0 | Maybe | -frename-registers (-O3)
  22.7 |  Yes  | -malign-double
  22.7 | Maybe | -mno-push-args
  21.7 | Maybe | -falign-loops (-O2 GCC 3.3)
  21.4 | Maybe | -freorder-blocks (-O2)
  20.5 | Maybe | -fexpensive-optimizations (-O2)
  20.2 | Maybe | -fpeephole2 (-O2)
  19.8 | Maybe | -fstrength-reduce (-O2)
  19.6 | Maybe | -falign-jumps (-O2 GCC 3.3)
  19.4 | Maybe | -fno-omit-frame-pointer (! -O1)
  19.4 | Maybe | -fno-delayed-branch (! -O1)
  19.2 | Maybe | -freduce-all-givs
  17.3 |   No  | -fno-inline
  13.8 |   No  | -funroll-all-loops
  13.2 |   No  | -fno-if-conversion (! -O1)
  12.3 |   No  | -fregmove (-O2)
  11.7 |   No  | -fomit-frame-pointer
  11.5 |   No  | -fnew-ra
  11.2 |   No  | -fno-guess-branch-probability (! -O1)
   9.3 |   No  | -fschedule-insns (-O2)
   8.4 |   No  | -ffloat-store
   6.9 |   No  | -fno-loop-optimize (! -O1)
   0.0 |   No  | -funroll-loops
   0.0 |   No  | -mfpmath=387
   0.0 |   No  | -mfpmath=sse
   0.0 |   No  | -mfpmath=sse,387
   0.0 |   No  | -momit-leaf-frame-pointer

lookitsme
n00b

Joined: 06 Nov 2003
Posts: 48
Location: Kuala Lumpur, Malaysia

Posted: Tue Apr 13, 2004 5:56 am

Hypnos wrote:
lookitsme wrote:
Am I just impatient and should I wait for all 7 tests to finish???

Yes. What's happening in all likelihood is that the standard deviation is too big for the means you're getting.


Thanks Hypnos, I showed a little more patience and the last test has just finished. Results for my Intel Pentium 4 with GCC 3.3:

Code:

 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  32.6 | Maybe | -fschedule-insns2 (-O2)
  31.5 |  Yes  | -fno-defer-pop (! -O1)
  31.0 |  Yes  | -fno-thread-jumps (! -O1)
  30.8 |  Yes  | -falign-jumps (-O2 GCC 3.3)
  30.8 |  Yes  | -fsched-spec (-O2 GCC 3.3)
  29.9 |  Yes  | -fcaller-saves (-O2)
  29.3 |  Yes  | -fpeephole2 (-O2)
  29.2 |  Yes  | -finline-limit
  29.1 |  Yes  | -falign-loops (-O2 GCC 3.3)
  28.5 |  Yes  | -fdelete-null-pointer-checks (-O2)
  28.3 | Maybe | -fcse-skip-blocks (-O2)
  27.9 |  Yes  | -frerun-loop-opt (-O2)
  27.9 |  Yes  | -frerun-cse-after-loop (-O2)
  27.7 |  Yes  | -fno-omit-frame-pointer (! -O1)
  27.7 | Maybe | -funsafe-math-optimizations (fast math)
  27.5 |  Yes  | -mno-align-stringops
  27.3 |  Yes  | -freorder-functions (-O2 GCC 3.3)
  27.2 |  Yes  | -malign-double
  26.6 | Maybe | -fforce-mem (-O2)
  26.6 | Maybe | -fmove-all-movables
  26.4 | Maybe | -frename-registers (-O3)
  26.3 | Maybe | -fcse-follow-jumps (-O2)
  25.8 | Maybe | -foptimize-sibling-calls (-O2)
  24.3 | Maybe | -mno-push-args
  24.3 | Maybe | -finline-functions (-O3)
  24.3 | Maybe | -fno-signaling-nans (fast math)
  24.1 | Maybe | -minline-all-stringops
  24.1 |  Yes  | -freduce-all-givs
  24.0 | Maybe | -fsched-interblock (-O2 GCC 3.3)
  23.9 |  Yes  | -fno-if-conversion2 (! -O1)
  23.2 | Maybe | -mieee-fp
  22.8 | Maybe | -ftracer
  22.7 | Maybe | -fno-merge-constants (! -O1)
  22.6 |  Yes  | -freorder-blocks (-O2)
  22.5 | Maybe | -ffinite-math-only (fast math)
  22.0 | Maybe | -falign-labels (-O2 GCC 3.3)
  21.9 | Maybe | -fno-crossjumping (! -O1)
  21.3 | Maybe | -fstrength-reduce (-O2)
  21.3 | Maybe | -fno-math-errno (fast math)
  20.7 | Maybe | -maccumulate-outgoing-args
  20.0 | Maybe | -fno-delayed-branch (! -O1)
  20.0 |   No  | -fexpensive-optimizations (-O2)
  18.7 |   No  | -fgcse (-O2)
  17.8 | Maybe | -fno-cprop-registers (! -O1)
  17.3 |   No  | -fregmove (-O2)
  16.9 | Maybe | -fprefetch-loop-arrays
  16.8 |   No  | -fno-if-conversion (! -O1)
  16.8 |   No  | -fstrict-aliasing (-O2)
  15.2 | Maybe | -fno-trapping-math (fast math)
  13.9 |   No  | -fschedule-insns (-O2)
  13.3 |   No  | -fomit-frame-pointer
  11.7 |   No  | -fno-guess-branch-probability (! -O1)
  11.2 |   No  | -fno-loop-optimize (! -O1)
  10.8 | Maybe | -momit-leaf-frame-pointer
  10.6 |   No  | -fno-inline
   7.8 |   No  | -ffloat-store
   7.4 |   No  | -fnew-ra
   7.1 |   No  | -funroll-loops
   6.7 |   No  | -mfpmath=387
   0.0 |   No  | -funroll-all-loops
   0.0 |   No  | -mfpmath=sse
   0.0 |   No  | -mfpmath=sse,387


I'm recompiling some programs at the moment (Mozilla, Evolution and whatever else I can find) with -O2 and all the Yes options from the results. But since I'm at work accessing my machine over SSH, I can't tell the difference until tonight. Although I'm already under the impression that the compiling process itself sped up.

nmcsween
Guru

Joined: 12 Nov 2003
Posts: 381

Posted: Tue Apr 13, 2004 11:02 am

Can someone tell me where exactly the scripts go, what they should be called, and how to run them?

aethyr
Veteran

Joined: 06 Apr 2003
Posts: 1085
Location: NYC

Posted: Tue Apr 13, 2004 1:13 pm

Ultraoctane.com wrote:
Can someone tell me where exactly the scripts go, what they should be called, and how to run them?


Scripts should probably go in the same directory that you generate your acovea data from. They can be called anything (just copy and paste them from this topic into a file). You can run them with the following command:
Code:

perl scriptname

Hypnos
Advocate

Joined: 18 Jul 2002
Posts: 2889
Location: Omnipresent

Posted: Tue Apr 13, 2004 2:30 pm

lookitsme wrote:
[...] Although I'm already under the impression that the compiling process itself sped up.

Yes -- turning off a lot of the -O1 stuff seems to be a real time saver.

nmcsween
Guru

Joined: 12 Nov 2003
Posts: 381

Posted: Tue Apr 13, 2004 3:33 pm

Nice thread, Hypnos. Now I'm wondering: is there a database of acovea tests anywhere? If not, why don't we take the initiative? I could supply a few SQL databases, about 1GB of webspace and some PHP work.

Hypnos
Advocate

Joined: 18 Jul 2002
Posts: 2889
Location: Omnipresent

Posted: Tue Apr 13, 2004 5:57 pm

Ultraoctane.com wrote:
Nice thread, Hypnos. Now I'm wondering: is there a database of acovea tests anywhere?

Not that I know of.

Quote:
If not, why don't we take the initiative? I could supply a few SQL databases, about 1GB of webspace and some PHP work.

What do you have in mind exactly? What my script does is find the best "average" CFLAGS across a world-representative set of benchmarks for a given CPU -- that could all fit on one table, albeit a large one, but a database might be overkill ...

EDIT: I guess the table has more than two dimensions (D>2), since you have CPU, switches and compiler version.
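
For what it's worth, those three dimensions fit a single relational table keyed on (CPU, compiler version, switch). The following is a minimal sketch using SQLite from Python; the table name, column names and the helper `recommended` are all assumptions, and the rows are copied from the results aethyr and lookitsme posted in this thread.

```python
import sqlite3

def open_results_db():
    """Create an in-memory database with one row per (cpu, gcc, switch)."""
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE results (
            cpu     TEXT NOT NULL,
            gcc     TEXT NOT NULL,
            switch  TEXT NOT NULL,
            score   REAL NOT NULL,
            verdict TEXT NOT NULL CHECK (verdict IN ('Yes', 'Maybe', 'No')),
            PRIMARY KEY (cpu, gcc, switch)
        )""")
    return db

def recommended(db, cpu, gcc):
    """Return the 'Yes' switches for one CPU and compiler, best score first."""
    rows = db.execute(
        "SELECT switch FROM results "
        "WHERE cpu = ? AND gcc = ? AND verdict = 'Yes' "
        "ORDER BY score DESC",
        (cpu, gcc),
    )
    return [row[0] for row in rows]

db = open_results_db()
db.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", [
    ("Celeron2 850 MHz", "3.3", "-fgcse", 31.0, "Maybe"),
    ("Celeron2 850 MHz", "3.3", "-finline-functions", 29.8, "Yes"),
    ("Celeron2 850 MHz", "3.3", "-fno-defer-pop", 28.3, "Yes"),
    ("Pentium 4", "3.3", "-fno-defer-pop", 31.5, "Yes"),
    ("Pentium 4", "3.3", "-fgcse", 18.7, "No"),
])

print(recommended(db, "Celeron2 850 MHz", "3.3"))
```

The composite primary key is what makes it "one table": each extra dimension becomes another key column rather than another table.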

nmcsween
Guru

Joined: 12 Nov 2003
Posts: 381

Posted: Tue Apr 13, 2004 8:52 pm

What I have in mind is to see how every flag performs.
Quote:
GCC supports different switch sets for different languages and processors; in the case of the Pentium 4, for example, more than 60 different switches affect code optimization and generation. Testing every possible combination of those switches is simply impractical. Even at the unattainable theoretical speed of one test per second, it would take over hundreds of billions of years to run the complete set of possibilities!
That is close to what I have in mind, but to make it a reality, why not try only the optimistic switches for that processor? This would require some kind of distributed computing effort, but it could be done. Maybe an online database of which switches have been tested on each processor, and a Perl script that fetches that information and tests the next switch(es)?

nmcsween
Guru

Joined: 12 Nov 2003
Posts: 381

Posted: Tue Apr 13, 2004 11:48 pm

Even if it were just the overall best CFLAGS in a database, it would save people a considerable amount of time.
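
As a rough idea of what "overall best" could mean once several people have submitted results, here is a sketch that keeps only the switches rated "Yes" on every machine. The verdicts are an excerpt of the three complete tables posted in this thread; the dictionary layout and the function name `consensus_yes` are assumptions.

```python
# Excerpt of verdicts from the three full result tables posted in this thread.
VERDICTS = {
    "Angel666 (-march=pentium4)": {
        "-fno-defer-pop": "Yes", "-finline-limit": "Yes", "-fsched-spec": "Yes",
        "-ftracer": "Yes", "-fgcse": "Maybe", "-funroll-loops": "No",
    },
    "aethyr (850 MHz Celeron2)": {
        "-fno-defer-pop": "Yes", "-finline-limit": "Yes", "-fsched-spec": "Yes",
        "-ftracer": "Maybe", "-fgcse": "Maybe", "-funroll-loops": "No",
    },
    "lookitsme (Pentium 4)": {
        "-fno-defer-pop": "Yes", "-finline-limit": "Yes", "-fsched-spec": "Yes",
        "-ftracer": "Maybe", "-fgcse": "No", "-funroll-loops": "No",
    },
}

def consensus_yes(verdicts_by_machine):
    """Return the sorted switches that every machine rated 'Yes'."""
    yes_sets = [
        {switch for switch, verdict in table.items() if verdict == "Yes"}
        for table in verdicts_by_machine.values()
    ]
    return sorted(set.intersection(*yes_sets)) if yes_sets else []

print(consensus_yes(VERDICTS))
```

Whether a consensus across different CPUs is meaningful is a separate question, since the thread shows the same switch scoring differently from machine to machine.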

wilburpan
l33t

Joined: 21 Jan 2003
Posts: 977

Posted: Wed Apr 14, 2004 3:08 am

How long does it take to run the two acovea scripts (which I very imaginatively named acovea1 and acovea2, by the way :? )? I'm using a 700 MHz P3.
_________________
I'm only hanging out in OTW until I get rid of this stupid l33t ranking.....Crap. That didn't work.

Hypnos
Advocate

Joined: 18 Jul 2002
Posts: 2889
Location: Omnipresent

Posted: Wed Apr 14, 2004 3:10 am

Ultraoctane.com wrote:
Even if it were just the overall best CFLAGS in a database, it would save people a considerable amount of time.

I agree.

So then, maybe a web interface where people enter the CPU specs and benchmark numbers or summary results from my script?

Hypnos
Advocate

Joined: 18 Jul 2002
Posts: 2889
Location: Omnipresent

Posted: Wed Apr 14, 2004 3:32 am

wilburpan wrote:
How long does it take to run the two acovea scripts (which I very imaginatively named acovea1 and acovea2, by the way :? )? I'm using a 700 MHz P3.

The first bash script, ~48 hours on my 1.6GHz P4; the second Perl script, a few milliseconds.

nmcsween
Guru

Joined: 12 Nov 2003
Posts: 381

Posted: Wed Apr 14, 2004 4:30 am

I think it would be better if it were submitted directly via an upload. A web interface seems like a burden on whoever runs the test. We could upload it with a Perl socket connection, although I think I need root access on the server to use Perl sockets. If anyone knows of a better way, please post.
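
One possible alternative to raw sockets is a plain HTTP POST, which any ordinary web host can accept from a script. The sketch below only builds such a request and does not send it; the URL is a placeholder, and the JSON field names and the function `build_upload` are invented for illustration.

```python
import json
import urllib.request

def build_upload(cpu, gcc, rows, url="http://example.org/acovea/upload"):
    """Build (but do not send) an HTTP POST carrying one machine's results.

    rows: iterable of (switch, score, verdict) tuples.
    url:  placeholder address, not a real service.
    """
    payload = {
        "cpu": cpu,
        "gcc": gcc,
        "results": [
            {"switch": switch, "score": score, "verdict": verdict}
            for switch, score, verdict in rows
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_upload("Pentium 4", "3.3", [("-fno-defer-pop", 31.5, "Yes")])
print(req.get_method(), req.full_url)
# Sending would be: urllib.request.urlopen(req). It is left out here because
# the URL above does not point at a real upload service.
```

On the receiving end this needs only a script that reads the POST body, so no special server privileges should be required.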

All times are GMT