Acovea-4.0.0 : Try out my ebuilds (and scripts)

Vagabond
Apprentice
Posts: 192
Joined: Sun Jan 19, 2003 2:42 pm

Post by Vagabond » Thu Apr 22, 2004 8:41 pm

Ack, this is *slow*: 3 hours to do the alma test, and there are 6 more to go. I guess the problem is that I have an SMP system and it's only using one CPU. At least that means I have one CPU free to do other stuff without a noticeable performance hit. Still, it's gonna take forever at this rate.

Vag

robmoss
Retired Dev
Posts: 2634
Joined: Tue May 27, 2003 4:42 pm
Location: Jesus College, Oxford

Post by robmoss » Thu Apr 22, 2004 8:52 pm

Of course it's slow, that's the point - it's very thorough...
Reality is for those who can't face Science Fiction.

emerge -U will kill your Gentoo
ecatmur, Lord of Portage Bash Scripts

wilburpan
l33t
Posts: 977
Joined: Tue Jan 21, 2003 3:48 pm

Post by wilburpan » Fri Apr 23, 2004 2:14 pm

So after ~150 total hours of compiling, I have results from the acovea script run under two different conditions. These were done on a 700 MHz P3 laptop with 512 MB RAM. Just to get to the point, because this will be a long post, the conditions under which the acovea scripts are run influence the output.

My first run was done in console mode. I just left my laptop on my desk, and did not do anything with it other than let it generate heat while running the acovea script.

Code: Select all

 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  31.7 |  Yes  | -malign-double
  31.5 |  Yes  | -fcaller-saves (-O2)
  31.2 |  Yes  | -foptimize-sibling-calls (-O2)
  30.9 |  Yes  | -freorder-blocks (-O2)
  30.4 |  Yes  | -fsched-interblock (-O2 GCC 3.3)
  29.8 | Maybe | -ftracer
  29.2 |  Yes  | -fdelete-null-pointer-checks (-O2)
  29.1 | Maybe | -funsafe-math-optimizations (fast math)
  29.1 |  Yes  | -fmove-all-movables
  29.0 |  Yes  | -fno-if-conversion2 (! -O1)
  28.6 | Maybe | -fgcse (-O2)
  27.5 | Maybe | -finline-limit
  27.1 |  Yes  | -fno-thread-jumps (! -O1)
  27.1 | Maybe | -finline-functions (-O3)
  26.1 |  Yes  | -fno-defer-pop (! -O1)
  26.1 |  Yes  | -fsched-spec (-O2 GCC 3.3)
  26.0 | Maybe | -fstrict-aliasing (-O2)
  25.7 |  Yes  | -ffinite-math-only (fast math)
  25.6 | Maybe | -fexpensive-optimizations (-O2)
  25.6 | Maybe | -fno-math-errno (fast math)
  25.0 | Maybe | -fno-trapping-math (fast math)
  24.9 |  Yes  | -fpeephole2 (-O2)
  24.8 | Maybe | -fschedule-insns2 (-O2)
  24.8 |  Yes  | -falign-jumps (-O2 GCC 3.3)
  24.6 |  Yes  | -falign-labels (-O2 GCC 3.3)
  24.4 | Maybe | -fprefetch-loop-arrays
  24.3 | Maybe | -mno-align-stringops
  23.7 | Maybe | -freorder-functions (-O2 GCC 3.3)
  23.6 | Maybe | -frename-registers (-O3)
  23.2 | Maybe | -falign-loops (-O2 GCC 3.3)
  22.5 | Maybe | -fcse-follow-jumps (-O2)
  21.8 | Maybe | -fno-delayed-branch (! -O1)
  21.8 | Maybe | -fno-omit-frame-pointer (! -O1)
  21.6 | Maybe | -fno-crossjumping (! -O1)
  21.6 | Maybe | -frerun-cse-after-loop (-O2)
  21.4 | Maybe | -fcse-skip-blocks (-O2)
  20.9 | Maybe | -mieee-fp
  20.9 | Maybe | -frerun-loop-opt (-O2)
  20.7 | Maybe | -fno-cprop-registers (! -O1)
  20.6 | Maybe | -maccumulate-outgoing-args
  20.2 | Maybe | -fno-signaling-nans (fast math)
  19.6 | Maybe | -fno-merge-constants (! -O1)
  19.3 |   No  | -fforce-mem (-O2)
  19.0 | Maybe | -mno-push-args
  18.2 |   No  | -fno-if-conversion (! -O1)
  18.2 | Maybe | -minline-all-stringops
  15.6 |   No  | -freduce-all-givs
  15.1 |   No  | -fstrength-reduce (-O2)
  11.8 |   No  | -fnew-ra
  11.6 |   No  | -fno-guess-branch-probability (! -O1)
  11.1 |   No  | -fschedule-insns (-O2)
  10.4 |   No  | -ffloat-store
  10.4 |   No  | -fregmove (-O2)
   9.6 |   No  | -fno-inline
   9.4 |   No  | -funroll-all-loops
   8.5 |   No  | -fomit-frame-pointer
   7.2 |   No  | -funroll-loops
   0.0 |   No  | -fno-loop-optimize (! -O1)
   0.0 |   No  | -mfpmath=387
   0.0 |   No  | -mfpmath=sse
   0.0 |   No  | -mfpmath=sse,387
   0.0 |   No  | -momit-leaf-frame-pointer
My second run was done from within a Konsole terminal while running KDE. During the 84 hours it took to do this run, I was using my computer for desktop-type work -- email, surfing, word processing, etc. I did not do any other emerging/compiling while this was going on.

Code: Select all

 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  35.4 |  Yes  | -maccumulate-outgoing-args
  32.5 | Maybe | -fstrict-aliasing (-O2)
  32.3 | Maybe | -fgcse (-O2)
  31.8 |  Yes  | -fno-cprop-registers (! -O1)
  31.6 |  Yes  | -fno-trapping-math (fast math)
  30.6 |  Yes  | -fexpensive-optimizations (-O2)
  30.4 |  Yes  | -fno-delayed-branch (! -O1)
  30.0 |  Yes  | -falign-jumps (-O2 GCC 3.3)
  29.9 |  Yes  | -frerun-loop-opt (-O2)
  29.8 |  Yes  | -minline-all-stringops
  29.8 |  Yes  | -mieee-fp
  29.7 |  Yes  | -fmove-all-movables
  28.9 |  Yes  | -fno-omit-frame-pointer (! -O1)
  28.8 |  Yes  | -fsched-interblock (-O2 GCC 3.3)
  28.8 | Maybe | -freorder-blocks (-O2)
  28.5 |  Yes  | -freorder-functions (-O2 GCC 3.3)
  28.0 |  Yes  | -fno-merge-constants (! -O1)
  27.8 |  Yes  | -frerun-cse-after-loop (-O2)
  27.7 |  Yes  | -fschedule-insns2 (-O2)
  27.4 |  Yes  | -fdelete-null-pointer-checks (-O2)
  27.4 |  Yes  | -ffinite-math-only (fast math)
  27.2 |  Yes  | -finline-functions (-O3)
  26.3 |  Yes  | -fcse-skip-blocks (-O2)
  26.2 |  Yes  | -falign-labels (-O2 GCC 3.3)
  26.0 | Maybe | -falign-loops (-O2 GCC 3.3)
  25.9 |  Yes  | -fno-if-conversion2 (! -O1)
  25.7 | Maybe | -fcse-follow-jumps (-O2)
  25.5 |  Yes  | -fcaller-saves (-O2)
  25.3 | Maybe | -fno-thread-jumps (! -O1)
  25.1 |  Yes  | -fpeephole2 (-O2)
  24.6 | Maybe | -fforce-mem (-O2)
  24.5 |  Yes  | -fprefetch-loop-arrays
  24.3 | Maybe | -frename-registers (-O3)
  24.2 | Maybe | -funsafe-math-optimizations (fast math)
  23.8 |  Yes  | -foptimize-sibling-calls (-O2)
  23.6 | Maybe | -fno-defer-pop (! -O1)
  23.3 |  Yes  | -fstrength-reduce (-O2)
  23.3 |  Yes  | -fsched-spec (-O2 GCC 3.3)
  23.2 | Maybe | -mno-push-args
  23.0 | Maybe | -ftracer
  22.7 |  Yes  | -fregmove (-O2)
  22.0 | Maybe | -fno-crossjumping (! -O1)
  21.8 | Maybe | -malign-double
  21.4 | Maybe | -freduce-all-givs
  20.6 | Maybe | -finline-limit
  19.7 | Maybe | -fno-math-errno (fast math)
  19.4 | Maybe | -fno-signaling-nans (fast math)
  18.3 |   No  | -fschedule-insns (-O2)
  17.1 | Maybe | -mno-align-stringops
  16.7 |   No  | -fno-if-conversion (! -O1)
  16.5 |   No  | -fno-inline
  12.4 |   No  | -ffloat-store
  11.6 |   No  | -fno-guess-branch-probability (! -O1)
  11.0 | Maybe | -mfpmath=sse
  10.2 |   No  | -funroll-loops
   9.8 |   No  | -funroll-all-loops
   9.5 |   No  | -fnew-ra
   9.3 |   No  | -fno-loop-optimize (! -O1)
   8.9 |   No  | -fomit-frame-pointer
   0.0 |   No  | -mfpmath=387
   0.0 |   No  | -mfpmath=sse,387
   0.0 |   No  | -momit-leaf-frame-pointer
As you can see, the results are quite different. I don't know why the running environment should affect the acovea results, and I'm not sure which set of recommendations I should use.
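
To see exactly where the two runs disagree, the reports can be compared mechanically. The following is a minimal Python sketch, not part of Acovea or of the Perl script from this thread. It assumes each report is available as plain text in the `Score | So? | Switch` layout shown above; `parse_report` and `verdict_changes` are hypothetical helper names, and only three rows from each run are included to keep it short.

```python
def parse_report(text):
    """Map each switch to (score, verdict) from a 'Score | So? | Switch' table."""
    rows = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3 or not parts[2]:
            continue  # separator line, blank line, or prose
        try:
            score = float(parts[0])
        except ValueError:
            continue  # header row
        switch = parts[2].split()[0]  # drop the "(-O2)" style annotation
        rows[switch] = (score, parts[1])
    return rows


def verdict_changes(first, second):
    """List (switch, verdict_1, verdict_2) for switches whose verdict differs."""
    return sorted(
        (switch, first[switch][1], second[switch][1])
        for switch in first.keys() & second.keys()
        if first[switch][1] != second[switch][1]
    )


# Three rows copied from each of the two reports above.
console_run = """
 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  31.7 |  Yes  | -malign-double
  20.6 | Maybe | -maccumulate-outgoing-args
  24.9 |  Yes  | -fpeephole2 (-O2)
"""
kde_run = """
  35.4 |  Yes  | -maccumulate-outgoing-args
  25.1 |  Yes  | -fpeephole2 (-O2)
  21.8 | Maybe | -malign-double
"""

changes = verdict_changes(parse_report(console_run), parse_report(kde_run))
for switch, before, after in changes:
    print(f"{switch}: {before} -> {after}")
```

Feeding in the complete tables instead of the excerpts would list every switch whose recommendation flipped between the idle and the busy run.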

Any comments would be welcome.
I'm only hanging out in OTW until I get rid of this stupid l33t ranking.....Crap. That didn't work.

Daagar
Tux's lil' helper
Posts: 78
Joined: Fri Mar 14, 2003 7:57 pm

Post by Daagar » Fri Apr 23, 2004 6:36 pm

This all sounds like great geeky fun, but I'm curious whether there is a way to do the tests in steps (some sort of pause/restart feature). My old Athlon Tbird 933 MHz can't devote the 48-72 hours it would take in a single sitting, and it would be nice to let it simply run overnight and be able to stop it when necessary and restart the next night...

Post by Vagabond » Fri Apr 23, 2004 6:41 pm

You can do the tests one by one; the longest a test has taken me so far (I'm down to the last 2) is 6 hours. Each test generates a .run file that the Perl script in this thread uses to generate its recommendations.

Vag

aethyr
Veteran
Posts: 1085
Joined: Sun Apr 06, 2003 5:16 pm
Location: NYC

Post by aethyr » Fri Apr 23, 2004 7:04 pm

Daagar wrote: This all sounds like great geeky fun, but I'm curious whether there is a way to do the tests in steps (some sort of pause/restart feature). My old Athlon Tbird 933 MHz can't devote the 48-72 hours it would take in a single sitting, and it would be nice to let it simply run overnight and be able to stop it when necessary and restart the next night...
Ctrl-Z, the same way you can pause any process ;)

Then later when you want it to run again, "fg" to resume in the foreground, or "bg" to resume in the background.
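
Ctrl-Z works because the terminal sends a stop signal (SIGTSTP) to the foreground job, and "fg"/"bg" continue it with SIGCONT. If the run was started from another terminal, the same effect can be had by PID. A minimal Python sketch, assuming a POSIX system; `pause`, `resume` and `demo` are hypothetical names, a sleeping child stands in for the real run, and SIGSTOP is used instead of SIGTSTP because it cannot be ignored.

```python
import os
import signal
import subprocess
import sys


def pause(pid):
    """Suspend a process, much as Ctrl-Z does (SIGSTOP cannot be ignored)."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Let a suspended process continue, as "fg" or "bg" would."""
    os.kill(pid, signal.SIGCONT)


def demo():
    """Start a stand-in long job, pause it, resume it, and report both steps."""
    job = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    try:
        pause(job.pid)
        _, status = os.waitpid(job.pid, os.WUNTRACED)   # returns once stopped
        stopped = os.WIFSTOPPED(status)
        resume(job.pid)
        _, status = os.waitpid(job.pid, os.WCONTINUED)  # returns once continued
        continued = os.WIFCONTINUED(status)
    finally:
        job.kill()  # clean up the stand-in job
        job.wait()
    return stopped, continued


print(demo())
```

A paused process still lives only in memory, so this survives a night's sleep but not a reboot.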

Post by robmoss » Fri Apr 23, 2004 9:06 pm

wilburpan wrote: As you can see, the results are quite different. I don't know why the running environment should affect the acovea results, and I'm not sure which set of recommendations I should use.
Use the former set. As previously stated, Acovea uses real time, not CPU time. Thus, the latter set are meaningless (sorry!). Acovea should be run with as little overhead as is possible. This includes stopping any distributed computing project clients, such as SETI@home or mprime, whilst the run is in progress.
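
The difference between real (wall-clock) time and CPU time can be seen with a small Python sketch. This is not Acovea's own timing code; `measure`, `waiting` and `working` are hypothetical names, with a sleep standing in for time lost while other programs hold the CPU.

```python
import time


def measure(task):
    """Return (wall_seconds, cpu_seconds) spent running task()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def waiting():
    time.sleep(0.2)  # stands in for time lost while other programs run


def working():
    sum(i * i for i in range(200_000))  # stands in for the benchmark itself


for task in (waiting, working):
    wall, cpu = measure(task)
    print(f"{task.__name__}: wall {wall:.3f}s, cpu {cpu:.3f}s")
```

For `waiting`, the wall-clock figure includes the whole sleep while the CPU figure stays near zero, which is why a busy desktop inflates scores that are based on real time.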

darkless
n00b
Posts: 42
Joined: Thu Jan 01, 2004 2:21 pm
Location: Denmark

Post by darkless » Sat Apr 24, 2004 11:00 am

I wonder what it would take to rewrite Acovea to use user time instead of real time, so it wouldn't be necessary to leave the system otherwise idle while the tests run.
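
As a rough idea of what measuring user time for a benchmark would involve, here is a minimal Python sketch. It is not a patch for Acovea and says nothing about how Acovea is structured internally; it assumes a POSIX system, `time_command` is a hypothetical name, and a short Python busy loop stands in for a compiled benchmark binary.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion; return (real, user, sys) seconds for the child."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return real, user, system


# Stand-in for a benchmark binary: a short busy loop in a child process.
benchmark = [sys.executable, "-c", "sum(i * i for i in range(300000))"]
real, user, system = time_command(benchmark)
print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```

The user figure counts only CPU time charged to the child, so it is less sensitive to other programs competing for the processor than the real figure is.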
Ignorance should be painful.

poisson
n00b
Posts: 35
Joined: Sun Nov 24, 2002 2:13 pm

Post by poisson » Sat Apr 24, 2004 5:14 pm

darkless wrote: I wonder what it would take to rewrite Acovea to use user time instead of real time, so it wouldn't be necessary to leave the system otherwise idle while the tests run.
I have tried to benchmark many programs (mainly floating-point ones) in working environments. Both system and user time are NOT accurate. I think it's a scheduler-related problem (not only on Linux, but also on Cray, SP4, and Compaq systems). The only way to get good values from a benchmark is to run it on a system in single-user mode.

Obviously, considering user time instead of system time is better.

I'm running acovea on my new dual opteron 242. Stay tuned :-)

Post by Daagar » Sat Apr 24, 2004 5:50 pm

Vagabond, thanks. I didn't realize that each test was separate, so I can run one per night or something. As for the suggestion to just Ctrl-Z the process: yes, of course I realized that would work, but I usually have to reboot back into Windows for the family during the day, which makes that solution impractical.

Post by poisson » Sun Apr 25, 2004 9:04 pm

As promised, these are the results of acovea on my system
(Dual Opteron 242, 2x512MB, MSI mobo)

Code: Select all

 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  45.1 |  Yes  | -funsafe-math-optimizations (fast math)
  43.1 |  Yes  | -ftracer
  36.2 |  Yes  | -fcaller-saves (-O2)
  36.1 |  Yes  | -fforce-mem (-O2)
  35.6 | Maybe | -mieee-fp
  34.5 |  Yes  | -fno-defer-pop (! -O1)
  34.0 |  Yes  | -falign-jumps (-O2 GCC 3.3)
  33.5 | Maybe | -fschedule-insns (-O2)
  33.2 |  Yes  | -fdelete-null-pointer-checks (-O2)
  33.1 |  Yes  | -fpeephole2 (-O2)
  32.7 | Maybe | -fregmove (-O2)
  32.7 |  Yes  | -finline-limit
  32.3 |  Yes  | -falign-labels (-O2 GCC 3.3)
  32.1 |  Yes  | -fcse-skip-blocks (-O2)
  32.0 | Maybe | -fgcse (-O2)
  31.6 |  Yes  | -freorder-blocks (-O2)
  30.7 |  Yes  | -fcse-follow-jumps (-O2)
  30.7 |  Yes  | -frename-registers (-O3)
  30.6 |  Yes  | -mno-align-stringops
  30.3 |  Yes  | -fno-if-conversion2 (! -O1)
  29.9 | Maybe | -fno-thread-jumps (! -O1)
  29.5 | Maybe | -fstrict-aliasing (-O2)
  29.3 |  Yes  | -maccumulate-outgoing-args
  28.9 | Maybe | -finline-functions (-O3)
  28.7 |  Yes  | -minline-all-stringops
  28.5 | Maybe | -fno-crossjumping (! -O1)
  28.4 |  Yes  | -fno-cprop-registers (! -O1)
  27.7 |  Yes  | -fsched-interblock (-O2 GCC 3.3)
  27.6 | Maybe | -fstrength-reduce (-O2)
  26.7 | Maybe | -fno-delayed-branch (! -O1)
  26.4 |  Yes  | -freorder-functions (-O2 GCC 3.3)
  26.3 | Maybe | -fno-omit-frame-pointer (! -O1)
  25.8 |  Yes  | -fmove-all-movables
  25.5 | Maybe | -fschedule-insns2 (-O2)
  25.4 | Maybe | -falign-loops (-O2 GCC 3.3)
  25.0 | Maybe | -fsched-spec (-O2 GCC 3.3)
  24.9 |   No  | -fprefetch-loop-arrays
  24.6 |  Yes  | -fexpensive-optimizations (-O2)
  24.0 |  Yes  | -ffinite-math-only (fast math)
  22.5 | Maybe | -fno-inline
  21.8 | Maybe | -mno-push-args
  21.4 | Maybe | -fno-signaling-nans (fast math)
  20.9 | Maybe | -funroll-loops
  20.8 | Maybe | -fno-merge-constants (! -O1)
  19.8 | Maybe | -freduce-all-givs
  19.4 | Maybe | -fno-math-errno (fast math)
  19.2 |   No  | -funroll-all-loops
  19.0 | Maybe | -foptimize-sibling-calls (-O2)
  18.7 |   No  | -fnew-ra
  18.5 |   No  | -mfpmath=387
  15.8 | Maybe | -fno-trapping-math (fast math)
  14.8 |   No  | -fno-if-conversion (! -O1)
  14.6 |   No  | -ffloat-store
  14.3 | Maybe | -frerun-cse-after-loop (-O2)
  12.3 |   No  | -frerun-loop-opt (-O2)
  11.3 |   No  | -mfpmath=sse,387
  10.3 |   No  | -fno-guess-branch-probability (! -O1)
   0.0 |   No  | -fno-loop-optimize (! -O1)
   0.0 |   No  | -mfpmath=sse
Then I tried "nbench" with standard optimization:

Code: Select all

CFLAGS=-s -static -Wall -O2
CPU                 : Dual AuthenticAMD AMD Opteron(tm) Processor 242 1604MHz
L2 Cache            : 1024 KB
OS                  : Linux 2.6.5-gentoo-r1
C compiler          : 3.3.3
MEMORY INDEX        : 11.142
INTEGER INDEX       : 10.406
FLOATING-POINT INDEX: 15.911
and with acovea optimization

Code: Select all

CFLAGS = -s -static -Wall -O1 -funsafe-math-optimizations -ftracer -fcaller-saves -fforce-mem -fno-defer-pop -falign-jumps -fdelete-null-pointer-checks -fpeephole2 -finline-limit=600 -falign-labels -fcse-skip-blocks -freorder-blocks -fcse-follow-jumps -frename-registers -mno-align-stringops -fno-if-conversion2 -maccumulate-outgoing-args -minline-all-stringops -fno-cprop-registers -fsched-interblock -freorder-functions -fmove-all-movables -fexpensive-optimizations -ffinite-math-only

CPU                 : Dual AuthenticAMD AMD Opteron(tm) Processor 242 1604MHz
L2 Cache            : 1024 KB
OS                  : Linux 2.6.5-gentoo-r1
C compiler          : 3.3.3 
MEMORY INDEX        : 10.553
INTEGER INDEX       : 9.486
FLOATING-POINT INDEX: 17.037
As you can see, floating-point is 7% better, but memory (-5%) and integer (-9%) suggest that acovea flags are not useful for workstation use.
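
The percentages quoted above follow directly from the two sets of nbench indexes. A short Python sketch of the arithmetic; `percent_change` is a hypothetical helper name, and the numbers are copied from the two nbench outputs above.

```python
def percent_change(baseline, tuned):
    """Relative change of tuned versus baseline, in percent."""
    return (tuned - baseline) / baseline * 100.0


# nbench indexes from the two runs above: (-O2 build, Acovea-flag build)
indexes = {
    "MEMORY": (11.142, 10.553),
    "INTEGER": (10.406, 9.486),
    "FLOATING-POINT": (15.911, 17.037),
}

for name, (baseline, tuned) in indexes.items():
    print(f"{name}: {percent_change(baseline, tuned):+.1f}%")
```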

Also, -funsafe-math-optimizations gives the boost, but it's deprecated.
I think -O2 is the best choice for my Gentoo installation, but I will try other combinations, removing deprecated or conflicting flags suggested by Acovea.

Post by aethyr » Sun Apr 25, 2004 10:19 pm

I don't know if nbench is the best choice...
These are Native Mode (a.k.a. Algorithm Level) tests; benchmarks designed to expose the capabilities of a system's CPU, FPU, and memory system.
I still don't know what the best benchmark would be, though... I asked Scott in another thread, and I'm still waiting to hear back from him.

Post by poisson » Sun Apr 25, 2004 10:49 pm

aethyr wrote: I don't know if nbench is the best choice...
These are Native Mode (a.k.a. Algorithm Level) tests; benchmarks designed to expose the capabilities of a system's CPU, FPU, and memory system.
I still don't know what the best benchmark would be, though... I asked Scott in another thread, and I'm still waiting to hear back from him.
Acovea uses similar algorithms. It is good in an "ideal" world, where the machine is dedicated to number-crunching.

For multi-purpose machines (i.e. workstations), there are a lot of parameters, and I think the best benchmark is X/KDE/GNOME/OpenOffice startup :-)

Post by robmoss » Sun Apr 25, 2004 11:03 pm

poisson wrote: As you can see, floating-point is 7% better, but memory (-5%) and integer (-9%) suggest that acovea flags are not useful for workstation use.

Also, -funsafe-math-optimizations gives the boost, but it's deprecated.
I think -O2 is the best choice for my Gentoo installation, but I will try other combinations, removing deprecated or conflicting flags suggested by Acovea.
The above combination requires that you use -funsafe-math-optimizations, otherwise you're breaking Acovea's method. Removing it from your profile will give you an entirely different set of results.

Post by darkless » Mon Apr 26, 2004 7:37 am

To put that in other words: If there are certain flags known to break things for you (or you just don't feel like using them) then prevent Acovea from using them in the first place.

This can be done by modifying e.g. /usr/share/acovea/config/gcc34_pentium4.acovea so that it does not include specific flags.
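
A minimal Python sketch of that kind of edit follows. It makes one loud assumption that has not been checked against the real file: that each flag sits on a line of its own in the configuration. The sample text below is invented for illustration and is not the actual .acovea syntax; `mentions` and `drop_flags` are hypothetical names. Working on a copy rather than on the installed file seems the safer route.

```python
import re


def mentions(line, flag):
    """True if flag appears in line as a whole token, not as part of a longer flag."""
    pattern = r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])"
    return re.search(pattern, line) is not None


def drop_flags(config_text, unwanted):
    """Return config_text without any line that mentions an unwanted flag."""
    kept = [
        line
        for line in config_text.splitlines(keepends=True)
        if not any(mentions(line, flag) for flag in unwanted)
    ]
    return "".join(kept)


# Invented one-flag-per-line sample; NOT the real configuration format.
sample = (
    "flag -malign-double\n"
    "flag -fno-if-conversion2\n"
    "flag -ftracer\n"
)

print(drop_flags(sample, ["-malign-double", "-fno-if-conversion"]), end="")
```

The whole-token match matters because several switch names are prefixes of others, for example -fno-if-conversion and -fno-if-conversion2.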

Personally, I don't feel like doing an "emerge -e world" right now, so I'd like to prevent acovea from using -malign-double. Also, the -funit-at-a-time flag has been known to break a few apps, and it somewhat increases compile time as well, so that might be another candidate for removal, until GCC-3.4 gets more widely adopted by software developers and/or gets more mature.

Post by poisson » Mon Apr 26, 2004 7:59 am

robmoss wrote: The above combination requires that you use -funsafe-math-optimizations, otherwise you're breaking Acovea's method. Removing it from your profile will give you an entirely different set of results.
The Acovea method works fine for specific problems; the profiles are always kept separate. IMHO, putting everything together in make.conf will slow down the whole system.

Other tests I made indicate that -O3 optimization is generally better. But gcc people warned about -O3 and x86-64 ... so I use -O2 for the moment.

I found another interesting Acovea application: what are the best optimizations for the Pentium M? You know, that processor is a hybrid between the Pentium 3 and Pentium 4, with 1 MB of L2 cache. I started with "alma", but I don't like to stress my laptop :-)

Hypnos
Advocate
Posts: 2889
Joined: Thu Jul 18, 2002 5:12 pm
Location: Omnipresent

Post by Hypnos » Mon Apr 26, 2004 9:13 am

It might be possible to construct a benchmark that puts the entire X/glib/GTK+/GNOME (or X/Qt/KDE) code stack through the wringer (maybe based on an automated gtk-demo), and returns a single fitness number to Acovea. That would likely be more realistic (w.r.t. desktop performance) than running through tight computational loops as with the current benchmarks bundled with Acovea.

Of course, the compile-run cycle for the entire stack would be rather time-consuming; even limiting to just glib/GTK+, you're looking at ~8MB of source code and ~4.5MB of machine code.

Perhaps to start with one could restrict to glib, and run the test routines that come with it as a composite Acovea bench (with per test weights), but that would have no X dependence whatsoever.
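
The "composite bench with per-test weights" idea could reduce to something like the following Python sketch. Everything in it is hypothetical: the test names, timings and weights are made up for illustration, `composite_fitness` is not an Acovea function, and it assumes that a lower number means a better result, as with a run time.

```python
def composite_fitness(run_times, weights):
    """Weighted sum of per-test run times in seconds; lower is better."""
    missing = set(weights) - set(run_times)
    if missing:
        raise ValueError(f"no timing for: {sorted(missing)}")
    return sum(weights[name] * run_times[name] for name in weights)


# Made-up test names, timings and weights, for illustration only.
run_times = {"hash-test": 1.20, "list-test": 0.80, "string-test": 2.50}
weights = {"hash-test": 2.0, "list-test": 1.0, "string-test": 0.5}

print(f"fitness: {composite_fitness(run_times, weights):.2f}")
```

Choosing the weights is the hard part: they decide which tests the search ends up favouring.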

Thoughts?

--------------

On the issue of running the tests one at a time, you could always just hit Ctrl-Z to freeze the script when you wake up and then resume it with "fg" when you go to bed.
Personal overlay | Simple backup scheme

ett_gramse_nap
Apprentice
Posts: 252
Joined: Wed Oct 01, 2003 6:54 am
Location: Göteborg, Sweden

Post by ett_gramse_nap » Mon Apr 26, 2004 11:49 am

Could one safely use cflags suggested by Acovea when bootstrapping?
Don't bother!

Post by Hypnos » Mon Apr 26, 2004 12:08 pm

ett_gramse_nap wrote: Could one safely use cflags suggested by Acovea when bootstrapping?
I don't know if I would risk binutils/glibc on such flags ....

solka
Apprentice
Posts: 287
Joined: Wed Jun 25, 2003 10:14 pm
Location: Torino - ITA

Post by solka » Mon Apr 26, 2004 3:54 pm

Hi all, I've run acovea in console without other processes running and after ~29 hours it finished and here is the result.

[System: Athlon XP 2100+ @ 1916 MHz, 512 MB DDR Corsair, Asus A7V8X motherboard]

Code: Select all

 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  36.2 |  Yes  | -fno-delayed-branch (! -O1)
  33.1 | Maybe | -fprefetch-loop-arrays
  32.9 | Maybe | -funsafe-math-optimizations (fast math)
  32.9 | Maybe | -fstrict-aliasing (-O2)
  31.2 |  Yes  | -fno-signaling-nans (fast math)
  30.4 |  Yes  | -falign-labels (-O2 GCC 3.3)
  29.7 | Maybe | -minline-all-stringops
  29.5 | Maybe | -ftracer
  27.8 |  Yes  | -fno-cprop-registers (! -O1)
  27.5 |  Yes  | -frerun-cse-after-loop (-O2)
  27.1 |   No  | -fforce-mem (-O2)
  27.1 |  Yes  | -fsched-interblock (-O2 GCC 3.3)
  27.0 |  Yes  | -fno-defer-pop (! -O1)
  26.7 |  Yes  | -mno-align-stringops
  26.7 | Maybe | -fcse-follow-jumps (-O2)
  26.3 |  Yes  | -fsched-spec (-O2 GCC 3.3)
  26.2 | Maybe | -finline-functions (-O3)
  26.0 |  Yes  | -fpeephole2 (-O2)
  26.0 |  Yes  | -fno-math-errno (fast math)
  25.8 |  Yes  | -freorder-functions (-O2 GCC 3.3)
  25.7 | Maybe | -fcse-skip-blocks (-O2)
  25.0 | Maybe | -falign-jumps (-O2 GCC 3.3)
  24.6 | Maybe | -fno-trapping-math (fast math)
  23.8 |   No  | -fstrength-reduce (-O2)
  23.7 |  Yes  | -fno-crossjumping (! -O1)
  23.4 | Maybe | -fno-if-conversion2 (! -O1)
  23.3 | Maybe | -mieee-fp
  22.7 | Maybe | -ffinite-math-only (fast math)
  22.7 | Maybe | -fno-merge-constants (! -O1)
  21.9 | Maybe | -frename-registers (-O3)
  21.5 | Maybe | -fregmove (-O2)
  20.8 |   No  | -fgcse (-O2)
  20.7 |   No  | -fcaller-saves (-O2)
  20.3 |   No  | -fschedule-insns2 (-O2)
  19.5 |   No  | -falign-loops (-O2 GCC 3.3)
  19.4 | Maybe | -freorder-blocks (-O2)
  19.3 | Maybe | -fno-thread-jumps (! -O1)
  18.0 |   No  | -fno-if-conversion (! -O1)
  17.9 | Maybe | -finline-limit
  17.9 | Maybe | -fno-omit-frame-pointer (! -O1)
  16.6 |   No  | -maccumulate-outgoing-args
  16.3 |   No  | -mno-push-args
  15.2 |   No  | -foptimize-sibling-calls (-O2)
  14.9 |   No  | -fno-inline
  14.7 |   No  | -fdelete-null-pointer-checks (-O2)
  14.6 |   No  | -frerun-loop-opt (-O2)
  13.9 |   No  | -fexpensive-optimizations (-O2)
  12.7 |   No  | -freduce-all-givs
  12.5 |   No  | -fmove-all-movables
  10.9 | Maybe | -mfpmath=sse,387
  10.9 |   No  | -fnew-ra
   8.3 |   No  | -fschedule-insns (-O2)
   7.6 |   No  | -fno-guess-branch-probability (! -O1)
   6.5 |   No  | -ffloat-store
   6.4 |   No  | -funroll-all-loops
   4.6 |   No  | -funroll-loops
   3.7 |   No  | -fno-loop-optimize (! -O1)
   0.0 |   No  | -mfpmath=387
   0.0 |   No  | -mfpmath=sse
In particular:

Code: Select all

  36.2 |  Yes  | -fno-delayed-branch (! -O1)
  31.2 |  Yes  | -fno-signaling-nans (fast math)
  30.4 |  Yes  | -falign-labels (-O2 GCC 3.3)
  27.8 |  Yes  | -fno-cprop-registers (! -O1)
  27.5 |  Yes  | -frerun-cse-after-loop (-O2)
  27.1 |  Yes  | -fsched-interblock (-O2 GCC 3.3)
  27.0 |  Yes  | -fno-defer-pop (! -O1)
  26.7 |  Yes  | -mno-align-stringops
  26.3 |  Yes  | -fsched-spec (-O2 GCC 3.3)
  26.0 |  Yes  | -fpeephole2 (-O2)
  26.0 |  Yes  | -fno-math-errno (fast math)
  25.8 |  Yes  | -freorder-functions (-O2 GCC 3.3)
  23.7 |  Yes  | -fno-crossjumping (! -O1)
Can I safely put these settings in my CFLAGS? Are there any other flags that Acovea doesn't show but that are worth adding (-Wall, -pipe)?
And what's the meaning of the bang before -O1?
Many thanks for any answers.
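
The "In particular" list above is simply every row whose middle column says Yes. A minimal Python sketch of that filter, with `recommended_flags` as a hypothetical name and four rows copied from the report above; the -O1 base in the last line follows the CFLAGS line poisson posted earlier and is an assumption, not a recommendation. It says nothing about whether the result is safe to use system-wide.

```python
def recommended_flags(report):
    """Return the switches marked 'Yes', in table order, without annotations."""
    flags = []
    for line in report.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and parts[1] == "Yes" and parts[2]:
            flags.append(parts[2].split()[0])
    return flags


# Four rows copied from the report above.
report = """
 Score |  So?  | Switch (annotation)
------------------------------------------------------------------------------
  36.2 |  Yes  | -fno-delayed-branch (! -O1)
  33.1 | Maybe | -fprefetch-loop-arrays
  31.2 |  Yes  | -fno-signaling-nans (fast math)
  27.1 |   No  | -fforce-mem (-O2)
"""

print('CFLAGS="-O1 ' + " ".join(recommended_flags(report)) + '"')
```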
"The only reason of the man's sadness
is that he can't stay peacefully in his room."

Blaise Pascal

Post by Daagar » Mon Apr 26, 2004 4:04 pm

! -O1 (read as 'not -O1') means it is explicitly turning off an option that is normally enabled when you specify -O1.
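
For anyone post-processing the reports, the annotations can be sorted into groups along those lines. A minimal Python sketch; `classify` is a hypothetical name, and reading a plain "(-O2)" as "already part of -O2" is my interpretation of the report format rather than something stated in this thread. Annotations such as "(fast math)" are passed through unchanged.

```python
import re


def classify(annotation):
    """Split an annotation like '(! -O1)' into (kind, detail)."""
    text = annotation.strip().strip("()").strip()
    match = re.match(r"(!?)\s*(-O\d)", text)
    if not match:
        return ("other", text)
    kind = "disables part of" if match.group(1) else "already part of"
    return (kind, match.group(2))


for note in ("(! -O1)", "(-O2)", "(-O2 GCC 3.3)", "(fast math)", ""):
    print(repr(note), "->", classify(note))
```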

Post by solka » Mon Apr 26, 2004 4:19 pm

So if I put -O2 in my CFLAGS, would that also include the flags marked ! -O1, or do I have to add them anyway?
"The only reason of the man's sadness
is that he can't stay peacefully in his room."

Blaise Pascal

Post by Daagar » Mon Apr 26, 2004 4:19 pm

poisson wrote:
robmoss wrote: The above combination requires that you use -funsafe-math-optimizations, otherwise you're breaking Acovea's method. Removing it from your profile will give you an entirely different set of results.
The Acovea method works fine for specific problems; the profiles are always kept separate. IMHO, putting everything together in make.conf will slow down the whole system.
So based on these findings, are we basically saying that Acovea does the job it set out to do, but that the current set of benchmarks is not really appropriate for what gentoo'ers are trying to use it for (getting a set of CFLAGS for their make.conf)? Or is this something specific to the amd64 architecture, since others such as Hypnos have claimed general improvements to their systems since implementing Acovea-suggested flags?

Basically, for the benefit of others reading this thread, is it currently worth the 30-72 hours necessary to run acovea to generate system-wide CFLAGS?

Post by Hypnos » Mon Apr 26, 2004 9:20 pm

Daagar wrote: Basically, for the benefit of others reading this thread, is it currently worth the 30-72 hours necessary to run acovea to generate system-wide CFLAGS?
Depends -- would you be happy with a 0-3% performance improvement for that time invested?

Post by Daagar » Mon Apr 26, 2004 11:40 pm

Hypnos wrote:
Daagar wrote: Basically, for the benefit of others reading this thread, is it currently worth the 30-72 hours necessary to run acovea to generate system-wide CFLAGS?
Depends -- would you be happy with a 0-3% performance improvement for that time invested?
Heheh... for me personally, sure. I'm twisted like that. However, as the previous poster found, there are instances where the performance goes _backwards_. I guess the question is whether the performance gains will in general outweigh the losses on an average gentoo'er's system (based on the assumption that most gentoo'ers are in a workstation environment, and not doing 24/7 number crunching).