Twist Guru
Joined: 03 Jan 2003 Posts: 414 Location: San Diego
Posted: Sun Dec 26, 2004 12:55 pm Post subject: Acovea analysis results against real world programs
Well, it's not very good. I have been testing my Acovea flag results (posted here) against more traditional "optimized" CFLAGS. The results have not argued strongly in favor of using Acovea-based recommendations.
My system is as follows:
Athlon64 3400+ w/1GB memory
Gentoo 2004.3 stable, with exceptions noted
gcc-3.4.3, glibc-2.3.4.20040808-r1
For each test, I would run the given app against sample data three times with my "normal" CFLAGS, then recompile and run it three times with the Acovea CFLAGS, averaging the results. No other significant load existed on the machine at the time. No window system was running (GDM was, and therefore xorg, as were my standard services like NFS and Samba, but they weren't actively doing anything). The actual tests were performed from an SSH session from another machine.
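The three-runs-and-average procedure above is easy to script. Here is a minimal sketch; the helper name avg_time and the use of bash's TIMEFORMAT are my own choices, not taken from the original post:

```shell
#!/bin/bash
export LC_ALL=C  # ensure `time` prints a dot as the decimal separator

# avg_time: run a command N times and print the mean wall-clock seconds.
avg_time() {
    local runs=$1; shift
    local total=0 t
    for ((i = 0; i < runs; i++)); do
        # TIMEFORMAT=%R makes bash's built-in `time` print only elapsed seconds
        t=$( { TIMEFORMAT=%R; time "$@" > /dev/null 2>&1; } 2>&1 )
        total=$(awk -v a="$total" -v b="$t" 'BEGIN { print a + b }')
    done
    awk -v s="$total" -v n="$runs" 'BEGIN { printf "%.3f\n", s / n }'
}

# Example with a stand-in workload:
avg_time 3 sleep 0.1
```

Timing the real encoders would then just be, e.g., `avg_time 3 flac --best input.wav` (file name hypothetical).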
My original acovea results:
Code: |
Score | So? | Switch (annotation)
------------------------------------------------------------------------------
35.8 | Yes | -minline-all-stringops
32.6 | Yes | -mno-push-args
31.8 | Maybe | -finline-functions (-O3)
31.8 | Yes | -fexpensive-optimizations (-O2)
30.4 | Maybe | -fschedule-insns (-O2)
30.3 | Maybe | -fpeel-loops
30.1 | Yes | -fno-if-conversion2 (! -O1)
29.8 | Yes | -fno-defer-pop (! -O1)
29.7 | Yes | -fcse-skip-blocks (-O2)
29.1 | Maybe | -frerun-loop-opt (-O2)
28.3 | Yes | -fsched-interblock (-O2 GCC 3.3)
28.2 | Yes | -foptimize-sibling-calls (-O2)
27.4 | Yes | -falign-jumps (-O2 GCC 3.3)
27.4 | Maybe | -fstrict-aliasing (-O2)
26.9 | Maybe | -fno-merge-constants (! -O1)
26.5 | Maybe | -finline-limit
26.1 | Maybe | -falign-functions
25.7 | Maybe | -fno-delayed-branch (! -O1)
25.4 | Maybe | -fpeephole2 (-O2)
25.4 | Maybe | -freorder-functions (-O2 GCC 3.3)
25.0 | Maybe | -fno-signaling-nans (fast math)
25.0 | Maybe | -freorder-blocks (-O2)
24.7 | No | -fstrength-reduce (-O2)
24.4 | Maybe | -frerun-cse-after-loop (-O2)
24.3 | Yes | -fmove-all-movables
24.2 | Maybe | -fcse-follow-jumps (-O2)
23.6 | Maybe | -fschedule-insns2 (-O2)
23.2 | Maybe | -fno-math-errno (fast math)
22.8 | Yes | -fsched-spec (-O2 GCC 3.3)
22.8 | Maybe | -maccumulate-outgoing-args
22.5 | Maybe | -fdelete-null-pointer-checks (-O2)
22.5 | Maybe | -falign-labels (-O2 GCC 3.3)
22.4 | Maybe | -fno-thread-jumps (! -O1)
22.3 | Maybe | -mieee-fp
22.2 | Maybe | -ftracer
22.0 | Maybe | -mno-align-stringops
21.4 | Maybe | -fno-crossjumping (! -O1)
21.3 | Maybe | -fno-cprop-registers (! -O1)
21.3 | Yes | -funit-at-a-time
21.1 | Maybe | -frename-registers (-O3)
20.9 | Maybe | -ffinite-math-only (fast math)
20.8 | Maybe | -fno-trapping-math (fast math)
20.6 | Maybe | -funswitch-loops
20.4 | No | -fweb
20.2 | Maybe | -fcaller-saves (-O2)
20.1 | No | -falign-loops (-O2 GCC 3.3)
19.9 | No | -fgcse (-O2)
19.1 | No | -fno-omit-frame-pointer (! -O1)
17.3 | No | -funsafe-math-optimizations (fast math)
17.1 | No | -fno-if-conversion (! -O1)
15.6 | No | -fregmove (-O2)
15.4 | Maybe | -fbranch-target-load-optimize
15.1 | No | -fprefetch-loop-arrays
13.6 | No | -fnew-ra
13.4 | No | -fno-inline
12.2 | No | -freduce-all-givs
12.2 | No | -funroll-all-loops
11.5 | No | -fforce-mem (-O2)
8.7 | No | -funroll-loops
5.2 | No | -fno-loop-optimize (! -O1)
4.6 | No | -ffloat-store
0.0 | No | -fno-guess-branch-probability (! -O1)
0.0 | No | -fbranch-target-load-optimize2
0.0 | No | -mfpmath=387
0.0 | No | -mfpmath=sse
0.0 | No | -mfpmath=sse,387
|
My "normal" optimized CFLAGS:
Code: |
CFLAGS="-O3 -march=athlon64 -mtune=athlon64 -ftracer -pipe"
|
CFLAGS recommended by acovea, see note below:
Code: |
CFLAGS="-O? -march=athlon64 -mtune=athlon64 -minline-all-stringops -mno-push-args -fexpensive-optimizations -fno-if-conversion2 -fno-defer-pop -fcse-skip-blocks -fsched-interblock -foptimize-sibling-calls -falign-jumps -fno-strength-reduce -fmove-all-movables -fsched-spec -funit-at-a-time -fno-web -fno-align-loops -fno-gcse -fomit-frame-pointer -fno-unsafe-math-optimizations -fif-conversion -fno-regmove -fno-prefetch-loop-arrays -fno-new-ra -finline -fno-reduce-all-givs -fno-unroll-all-loops -fno-force-mem -fno-unroll-loops -floop-optimize -fno-float-store -fguess-branch-probability -fno-branch-target-load-optimize2"
|
Acovea "alt" set:
Code: |
CFLAGS="-O3 -march=athlon64 -mtune=athlon64 -minline-all-stringops -mno-push-args -fno-if-conversion2 -fno-defer-pop -fno-strength-reduce -fmove-all-movables -funit-at-a-time -fno-align-loops -fno-gcse -fno-regmove -fno-force-mem -pipe"
|
Note: I am aware -march normally implies -mtune. I leave -mtune present in case -march is filtered for some reason. For the Acovea flags, I used the following methodology: I explicitly include all flags marked "Yes", explicitly exclude all flags marked "No", and then vary from -O1 to -O2 and finally -O3. For the Acovea "alt" set I use -O3 and only explicitly include the "Yes" indications, some of which, it should be noted, are logical NOT conditions against compilation methods.
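The include-every-"Yes" selection rule can be mechanized against the score table. A sketch, assuming the table is saved verbatim in the pipe-separated layout shown above (the function name is mine):

```shell
#!/bin/bash
# Pull the flags marked "Yes" out of an Acovea score table on stdin.
# Row format matches the listing above: Score | So? | Switch (annotation)
yes_flags() {
    awk -F'|' '$2 ~ /Yes/ { split($3, f, " "); print f[1] }'
}

# Small sample of the table (the full table is shown above):
yes_flags <<'EOF'
35.8 | Yes | -minline-all-stringops
31.8 | Maybe | -finline-functions (-O3)
24.7 | No | -fstrength-reduce (-O2)
32.6 | Yes | -mno-push-args
EOF
```

Piping the real table through this (and the inverted rule for the "No" rows) reproduces the flag lists below.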
TESTS
Test for flac-1.1.1
In this test I encoded Tchaikovsky's 1812 Overture using the "--best" flag option for flac.
Results:
Code: |
ACOVEA -O1:
real 2m3.003s
user 2m2.616s
sys 0m0.313s
ACOVEA -O2:
real 2m4.853s
user 2m4.430s
sys 0m0.333s
ACOVEA -O3:
real 2m4.395s
user 2m3.971s
sys 0m0.348s
ACOVEA alt:
real 1m2.734s
user 1m2.348s
sys 0m0.323s
REGULAR:
real 1m9.937s
user 1m9.545s
sys 0m0.326s
|
Test for lame-3.96.1
In this test I encoded the above 1812 Overture from raw .wav to mp3 using no special options.
Code: |
ACOVEA -O1:
real 1m12.179s
user 1m11.916s
sys 0m0.210s
ACOVEA -O2:
real 1m10.361s
user 1m10.109s
sys 0m0.203s
ACOVEA -O3:
FAILED - Segmentation fault (compiled twice to make sure)
ACOVEA alt:
FAILED - Segmentation fault (compiled twice to make sure)
REGULAR:
real 1m6.611s
user 1m6.354s
sys 0m0.189s
|
Test for bzip2-1.0.2-r3
In this test I compressed the raw .WAV of the previously used Tchaikovsky's 1812 Overture. The file is fairly large, with a size of 166368764 bytes. No flags for bzip2 were used.
Results:
Code: |
ACOVEA -O1:
real 0m50.877s
user 0m50.321s
sys 0m0.475s
ACOVEA -O2:
real 0m48.955s
user 0m48.435s
sys 0m0.447s
ACOVEA -O3:
real 0m46.516s
user 0m45.972s
sys 0m0.471s
ACOVEA alt:
real 0m42.366s
user 0m41.845s
sys 0m0.460s
REGULAR:
real 0m43.687s
user 0m43.162s
sys 0m0.450s
|
Conclusions
I am aware my test cases are drawn from a specific class of programs, that being encode/decode style logic. This is the easiest case to find reproducible results with; if others want to try more complex types of programs with 100% reproducible data sets, by all means please do!
In the examples given, Acovea-based results can't really be recommended. It's true that in one case they resulted in an approximately 11% performance increase for the flac encoding, but in the other tests they either performed worse, much worse, or failed to execute compared to "normal" optimizing CFLAGS. The interaction of the recommended flags appears highly situational and largely just noise when compared with the GCC "meta" flags of -O settings.
I would hazard a guess that acovea's default benchmarks are simply not indicative of the programs I used to test, and therefore made little if any headway in optimizing. Short of running an acovea style analysis of each program individually, I'm not sure how this would be fixed.
In the meantime, I'm sticking with my default CFLAGS =)
-Twist
ebrostig Bodhisattva
Joined: 20 Jul 2002 Posts: 3152 Location: Orlando, Fl
Posted: Mon Dec 27, 2004 1:11 am Post subject:
It is difficult to set individual flags that will give an overall improvement in speed. It all depends on what the program you want to run does and how it does it internally. In order to optimize a specific program you will have to perform the type of tests that you have done and adjust flags individually. That is not desirable in general.
The gcc suite internally sets many flags based on the -O? flag; they are all documented in the gcc man pages.
I have done numerous tests myself on my AMD64 3200+ and have come up with a set of flags that overall gives the most optimal performance and stability. The latter is not the least important, as you found out with some programs that segfaulted when run.
In general, it is best to stick with a minimal amount of flags and use the ones recommended for each platform.
I think you have done a great job and I applaud you for your persistence in testing the various combinations. Great write-up!
Erik _________________ 'Yes, Firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'
georgz Tux's lil' helper
Joined: 06 Dec 2002 Posts: 137 Location: Munich, Germany
Posted: Wed Dec 29, 2004 12:30 pm Post subject:
Quote: | I have done numerous tests myself on my AMD64 3200+ and have come up with a set of flags that overall gives the most optimal performance and stability. |
Which flags do you use? Are different flags suggested/recommended for 64bit or 32bit installations with Athlon64?
smokeslikeapoet Tux's lil' helper
Joined: 03 Apr 2003 Posts: 96 Location: Cordova, TN USA
Posted: Wed Dec 29, 2004 12:36 pm Post subject:
Instead of using Acovea I benchmarked my system in much the same way. I used LAME and some default optimizations. I md5-summed all of the resulting mp3s. -O3 gave me the best time. Then I started adding other combinations of CFLAGS until I noticed speed improvements. Again I md5-summed the resulting mp3s. I threw out the CFLAGS that gave me different md5 sums, most notably -ffast-math. Then I started taking out the CFLAGS that gave me no significant improvement in encoding time, until I was left with the minimal CFLAGS that reduced my encoding time by 40%. In case you were wondering, here are my CFLAGS for my Athlon 1800+ on an Epox Via 8HKA+.
Code: | CFLAGS="-march=athlon-xp -mtune=athlon-xp -O3 -pipe -fomit-frame-pointer -fforce-addr -falign-functions=16 -falign-jumps=16 -falign-loops=16 -falign-labels=1 -fprefetch-loop-arrays -maccumulate-outgoing-args" |
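The md5 regression check described above generalizes to any flag experiment: encode the same input under each candidate flag set and only trust the timing numbers if the outputs are bit-identical. A sketch (the helper name and file names are mine, not from the post):

```shell
#!/bin/bash
# same_output: compare md5 checksums of two files; succeeds (exit 0)
# only when the files are bit-identical, i.e. the flag change did not
# alter the program's actual results.
same_output() {
    local a b
    a=$(md5sum "$1" | cut -d' ' -f1)
    b=$(md5sum "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}

# Usage (placeholder names): after encoding with each flag set,
#   same_output baseline.mp3 candidate.mp3 || echo "flag set changed output -- reject"
```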
I doubt acovea would give me any significant improvement. _________________ -SmokesLikeaPoet
Folding@Home
MighMoS Guru
Joined: 24 Apr 2003 Posts: 416 Location: @ ~
Posted: Wed Dec 29, 2004 5:15 pm Post subject:
I'm curious about people using -O3, given that most tests agree that inlining functions slows down code on modern processors. The same goes for redundant CFLAGS such as specifying -fomit-frame-pointer at -O2 and above, because the GCC man page states that this is already implied.
Not to start another rant, but actually reading the man (or info) pages can help a lot too, and save time. _________________ jabber: MighMoS@jabber.org
localhost # export HOME=`which heart`
Twist Guru
Joined: 03 Jan 2003 Posts: 414 Location: San Diego
Posted: Wed Dec 29, 2004 8:48 pm Post subject:
Quote: | I'm curious as to people using -O3, due to the fact that most tests agree that inlining functions slow down code on modern processors. |
Qualify "most test results". I think that's probably "some test results I read", as I find that is most often the case, and then people generalize. Not trying to knock you; it's just been my very common experience.
The answer is I don't trust any of them as a generalization and try to test it myself to see. GCC has evolved recently at a very fast pace and its level of support for different processors varies considerably. What is true for one class of processor with a specific cycle rate, cache, and instruction set may be completely different for another. Thus, I test it myself.
Quote: | As well as redundant CFLAGS such as specifying -fomit-frame-pointer on -O2 and above, |
For a very simple reason, and yes, many of them have RTFM'd. If you RTFM the portage manual, you will realize that occasionally portage will filter some flags without telling you at the ebuild level. It's therefore valid to string individual flags after your "meta" optimization flag, in the hope that if the ebuild filters, say, -O3, you will still retain some optimization behaviors. In fairness, however, anything that filters -O2 would most likely filter all flags, so not much point there.
The specific combination you point out, "-O2 -fomit-frame-pointer", is not the default behavior for Intel class processors. From the gcc man page:
Quote: | "-O also turns on -fomit-frame-pointer on machines where doing so does not interfere with debugging. |
Since omitting the frame pointer interferes with stack unwinding on Intel-class processors, GCC does not do this until explicitly told to on those systems. So hopefully you didn't give your pet peeve advice to anybody running an Intel-class system =)
-Twist
MighMoS Guru
Joined: 24 Apr 2003 Posts: 416 Location: @ ~
Posted: Wed Dec 29, 2004 9:37 pm Post subject:
Twist wrote: | Since omitting the frame pointer is destructive to rewinding on Intel class processors, GCC does not do this until explicitly indicated on those systems. So hopefully you didn't give your pet peeve advice to anybody running an Intel class system =)
-Twist | Actually, I haven't, because I just read up on it the other day. Sorry about the small rant there, and you are right about "most test results". *backs away slowly* _________________ jabber: MighMoS@jabber.org
localhost # export HOME=`which heart`
ciaranm Retired Dev
Joined: 19 Jul 2003 Posts: 1719 Location: In Hiding
Posted: Wed Dec 29, 2004 10:29 pm Post subject:
MighMoS wrote: | I'm curious as to people using -O3, due to the fact that most tests agree that inlining functions slow down code on modern processors. |
Because most of the people you see who post their CFLAGS are the sort who don't have a clue what they're doing, and who just assume that bigger numbers and longer CFLAGS lines equate to faster code.
rhill Retired Dev
Joined: 22 Oct 2004 Posts: 1629 Location: sk.ca
Posted: Wed Dec 29, 2004 11:21 pm Post subject:
thanks twist, i was getting all set to go into MythBusters mode, but you ranted for me.
seriously there needs to be a GCC Myths FAQ
Quote: | -O2 does not include -fomit-frame-pointer on Intel archs |
Quote: | -mfpmath=sse,387 is BROKEN in any current release and will eat your children |
Quote: | -mmmx and -msse -msse2 are a waste of time and also BROKEN |
stuff like that, but written by someone who knows what they are talking about.
--de. _________________ by design, by neglect
for a fact or just for effect
ciaranm Retired Dev
Joined: 19 Jul 2003 Posts: 1719 Location: In Hiding
Posted: Wed Dec 29, 2004 11:29 pm Post subject:
dirtyepic wrote: | stuff like that, but written by someone who knows what they are talking about. |
I used to have one of those, but I got too much abuse from lovech^W clueless ricers over it, so I got rid of it.
Seriously though, I'm trying to get the following in as official policy on how we handle CFLAGS:
Quote: |
Guidelines for Flag Filtering
If a package breaks with any reasonable CFLAGS, it is best to filter the problematic flag if a bug report is received. Reasonable CFLAGS are -march=, -mcpu=, -mtune= (depending upon arch), -O2, -Os and -fomit-frame-pointer. Note that -Os should usually be replaced with -O2 rather than being stripped entirely. The -fstack-protector flag should probably be in this group too, although our hardened team claim that this flag never ever breaks anything...
If a package breaks with other CFLAGS, it is perfectly ok to close the bug with a WONTFIX suggesting that the user picks more sensible global CFLAGS. Similarly, if a bug report is received and is determined or suspected to be caused by daft CFLAGS, an INVALID resolution is appropriate.
|
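For reference, the filtering this policy talks about is what ebuilds do with the flag-o-matic eclass; filter-flags and replace-flags are real flag-o-matic functions, but the package context below is invented for illustration, not from any actual ebuild:

```shell
# Hypothetical ebuild fragment: sanitize user CFLAGS per the policy above.
inherit flag-o-matic

src_compile() {
    # -Os miscompiles this (invented) package: replace rather than strip,
    # per the guideline that -Os should become -O2
    replace-flags -Os -O2
    # a flag known to break this particular package: drop it outright
    filter-flags -ftracer
    econf || die "configure failed"
    emake || die "make failed"
}
```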
Take from that what you will about what you should have in make.conf...
Twist Guru
Joined: 03 Jan 2003 Posts: 414 Location: San Diego
Posted: Thu Dec 30, 2004 12:06 am Post subject:
Quote: | Actually, I haven't, because I just read up on it the other day. Sorry about the small rant there, and you are right about "most test results". *backs away slowly* |
LOL ok I guess I came across a bit too strong there. I was honestly just trying to convey the idea that -fomit-frame-pointer was not automatic with -O or above on Intel arch machines.
As for the 'most test results' thing, it's a common problem that I fall into myself, even as a coder and somebody who is very conversant with compilers and their behavior. This is why conceptually I like Acovea; it seems that it's either flawed somewhat in implementation (not enough breadth to the example benchmark code) or simply that GCC is prone to many contradictory behaviors that can't be generalized across an architecture, but must be taken in context to a specific set of code. I tend to favor the latter myself, but again it means nothing without more extensive testing =)
Quote: | I used to have one of those, but I got too much abuse from lovech^W clueless ricers over it, so I got rid of it.
Seriously though, I'm trying to get the following in as official policy on how we handle CFLAGS: |
I think that is an ok set of rules for the general case, sure. While it's annoying to get non-bugs submitted by Gentoo users who are doing unreasonable things with the compiler, it sort of comes with the territory and is part of the Gentoo flexibility/experience, so I would urge you not to turn to the dark side of bitterness on this issue =). I think the "stable" keyword ebuilds should all be responsible for handling any set of input CFLAGS to retain stable behavior (note that this most likely means rejecting almost all of them) and that your proposed policy would get us there.
If wishes were fishes though...I'd love to use the participatory nature of the Gentoo community to get definitive on some of this stuff. For instance, while we can label -fomit-frame-pointer as "safe" in that it doesn't break any known ebuilds, it would be great if we had a bug-buddy like facility to actually KNOW that for sure as part of the base install. Except maybe not as cumbersome and ugly as bug-buddy =). Something like -ftracer with the newer GCC releases, which (according to the GCC mailing list) should be entirely safe and improve the ability of other optimizations. -funit-at-a-time should also be safe, short of consuming extra memory for compiles, but I honestly don't have a feel at all for whether it breaks anything as I don't use it. It would be great if we could poll and consolidate results with some of these flag variants automatically.
Ah well. In the meantime, don't try this at home! Experienced coder here, attempting compilations on a closed course with appropriate safety gear. The sponsors remind you not to exceed your ability or that of your gear by sticking with stable keywords and not overriding ebuild behavior. Thank you, drive through.
-Twist
ciaranm Retired Dev
Joined: 19 Jul 2003 Posts: 1719 Location: In Hiding
Posted: Thu Dec 30, 2004 12:12 am Post subject:
If you want stable, don't set CFLAGS at all in make.conf. Just rely upon the profile-provided settings. Gentoo developers are not here to correct every single possible stupid thing you can do with make.conf.
rhill Retired Dev
Joined: 22 Oct 2004 Posts: 1629 Location: sk.ca
Posted: Thu Dec 30, 2004 1:19 am Post subject:
that kinda throws the whole 'freedom of choice' philosophy out the window though. sorry, just poking your buttons. i do appreciate all the work you do here for us and gentoo in general.
seriously though, i was surprised that "-pipe" isn't on that whitelist. are there actually situations where -pipe needs to be filtered or has caused problems (just curious). _________________ by design, by neglect
for a fact or just for effect
Last edited by rhill on Thu Dec 30, 2004 1:22 am; edited 1 time in total
ciaranm Retired Dev
Joined: 19 Jul 2003 Posts: 1719 Location: In Hiding
Posted: Thu Dec 30, 2004 1:21 am Post subject:
|
dirtyepic wrote: | that kinda throws the whole 'freedom of choice' philosophy out the window though. sorry, just poking your buttons. |
Oh, you're free to use other flags, and developers are free to ignore any bugs you submit if you do.
Quote: | seriously though, i was surprised that "-pipe" isn't on that whitelist. are there actually situations where -pipe needs to be filtered or has caused problems (just curious). |
-pipe doesn't count; it's not an optimisation flag and it doesn't alter the code produced. No problems with it though, guess I could explicitly say so...
rhill Retired Dev
Joined: 22 Oct 2004 Posts: 1629 Location: sk.ca
Posted: Thu Dec 30, 2004 1:25 am Post subject:
ciaranm wrote: | Oh, you're free to use other flags, and developers are free to ignore any bugs you submit if you do. |
yeah, definitely. no argument there.
Quote: | -pipe doesn't count, it's not an optimisation flag and it doesn't alter the code produced. No problems with it though, guess I could explicitly say so... |
oh ok. it is a CFLAG however, and the guideline didn't limit itself to optimization flags. i'm unfamiliar with how the filtering works of course, so perhaps the mistake was mine.
cheers. _________________ by design, by neglect
for a fact or just for effect
Hypnos Advocate
Joined: 18 Jul 2002 Posts: 2889 Location: Omnipresent
Posted: Thu Dec 30, 2004 4:54 am Post subject:
Twist,
Thanks for your work -- I'm glad someone has done something useful with my reporting scripts.
Comments:
* It seems that, apart from compilation problems, your Acovea "alt" CFLAGS did pretty well. This suggests that Acovea, for the algorithms you have chosen, has more reliably found negatives than affirmatives (apparently, the "maybe"'s from -O3 provided a big performance boost).
* The algorithms you have chosen are far more complex and heuristic than those employed by Acovea as benchmarks. For the former, this means that memory-intensive optimizations might be beneficial, since you are moving a lot of data and burning a lot of cycles anyway. For the latter, I'm not knowledgeable enough to infer how this would affect the performance of specific switches ....
* Is not GCC optimization for AMD notoriously bad? As you say in another post, the cross-dependencies of the various switches might be too extensive for even Acovea to dissect with its evolution.
* Here are my CFLAGS for my P4-Mobile:
Code: | CFLAGS="-pipe -Wall -O2 -march=pentium4 -mcpu=pentium4 -maccumulate-outgoing-args -minline-all-stringops -fmove-all-movables -fno-if-conversion2 -fno-crossjumping -fno-delayed-branch -fno-omit-frame-pointer -fno-merge-constants -fno-thread-jumps" |
I can't say one way or the other on performance movements (apart from placebo), but these flags have been prodigiously stable. _________________ Personal overlay | Simple backup scheme
Twist Guru
Joined: 03 Jan 2003 Posts: 414 Location: San Diego
Posted: Thu Dec 30, 2004 6:01 am Post subject:
Hypnos,
BTW, before anything else, I wanted to thank you for your ebuild and test scripts for Acovea. Fine work that I was too lazy to do myself.
Quote: | It seems that, apart from compilation problems, your Acovea "alt" CFLAGS did pretty well. |
Yes - I would hazard to guess that GCC is decent about deciding on its own when a method is negative (probably based on total instruction/tick count) and simply doesn't use it. So although those options came out as "no" according to Acovea, in real use GCC might benefit from them occasionally.
Quote: | The algorithms you have chosen are far more complex and heuristic than those employed by Acovea as benchmarks. |
The biggest fault I can find with my "real world" examples is that they are all memory intensive. They all pump a lot of data in total, they all want to do lots of fairly wide address space lookups and compares, etc. However, it's the nature of the beast that these types of apps are not only good demonstrations but also where I tend to spend a lot of wait time in real life. For purely algorithmic benchmarks I could have used nbench or the like, and for heavy mathematics, xfractint or celestia on a complex solution, I suppose. Might still go back and do that.
Quote: | Is not GCC optimization for AMD notoriously bad? |
AMD themselves are actively helping the GCC crew get their instruction scheduling up to par, and it is reportedly vastly improved in the later versions. Since I tested with 3.4.3, I figured that was good enough. It's definitely true that the GCC 2.9 series was simply awful with AMD procs, and the early 3 series (aside from general brokenness and stability issues) wasn't renowned either. I could and probably will run the same kind of comparison on one of my P4 machines, I just haven't gotten around to it yet.
-Twist
moocha Watchman
Joined: 21 Oct 2003 Posts: 5722
Posted: Thu Dec 30, 2004 6:24 am Post subject:
Twist wrote: | Something like -ftracer with the newer GCC releases, which (according to the GCC mailing list) should be entirely safe and improve the ability of other optimizations. |
Which only goes to show that the GCC mailing list can't be entirely trusted, since -ftracer breaks teTeX in a very weird fashion (executables don't crash but weirdly duplicate the file name they get passed, which of course causes the file not to be found). For details see https://bugs.gentoo.org/show_bug.cgi?id=50417 (the ebuild *still* doesn't filter that flag, and I'm pretty peeved about it... I even begged nicely)
As far as I'm aware, teTeX is the only package broken by -ftracer though. I use a bashrc-based filtering so teTeX doesn't get passed -ftracer but the rest do.
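The bashrc-based filtering mentioned here works because portage sources /etc/portage/bashrc for every build with the package name exported in ${PN}. This sketch is my guess at what such a hook looks like, not moocha's actual file:

```shell
# /etc/portage/bashrc -- sourced by portage during every ebuild phase.
# Strip -ftracer from the flags only when building tetex (see bug #50417);
# every other package keeps it.
if [ "${PN}" = "tetex" ]; then
    export CFLAGS="${CFLAGS/-ftracer/}"
    export CXXFLAGS="${CXXFLAGS/-ftracer/}"
fi
```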
My own flags (development desktop, dual P3, lots of L2 cache): Code: | CFLAGS="-march=pentium3 -mtune=pentium3 -O2 -pipe \
-fno-ident -fomit-frame-pointer -momit-leaf-frame-pointer -ftracer \
-fweb -frename-registers -finline-functions -finline-limit=280" |
The last line actually takes -O2 to -O3 - it's there because many ebuilds filter -O3. I chose to ignore that, but then that's my choice, and I wholeheartedly agree with the default restrictive filtering.
As to your Acovea findings - it's hardly surprising. The best optimizations for any software are, in this order:
(a) Having a good design from the start and not as an afterthought
(b) Using algorithms that are best suited for the task
(c) Using the compiler's profiling facilities to identify bottlenecks
.
.
.
(somewhere around letter m) Compiler flags
_________________ Military Commissions Act of 2006: http://tinyurl.com/jrcto
"Those who would give up essential liberty to purchase a little temporary safety deserve neither liberty nor safety."
-- attributed to Benjamin Franklin
Hypnos Advocate
Joined: 18 Jul 2002 Posts: 2889 Location: Omnipresent
Posted: Thu Dec 30, 2004 7:00 am Post subject:
moocha wrote: | As to your Acovea findings - it's hardly surprising. The best optimizations for any software are, in this order:
(a) Having a good design from the start and not as an afterthought
(b) Using algorithms that are best suited for the task
(c) Using the compiler's profiling facilities to identify bottlenecks
.
.
.
(somewhere around letter m) Compiler flags
|
Ah, but as Twist shows above, compiler flags can certainly be deleterious! _________________ Personal overlay | Simple backup scheme
dberkholz Retired Dev
Joined: 18 Mar 2003 Posts: 1008 Location: Minneapolis, MN, USA
Posted: Thu Dec 30, 2004 9:17 pm Post subject:
moocha wrote: | As far as I'm aware, teTeX is the only package broken by -ftracer though. I use a bashrc-based filtering so teTeX doesn't get passed -ftracer but the rest do. |
-ftracer also broke gtk+ last time I tried it. That was a lot of fun to track down, since the problem resulted in a mysterious collection of broken apps that used gtk+.
mbalino n00b
Joined: 09 Aug 2004 Posts: 30 Location: Edmonton
Posted: Thu Dec 30, 2004 10:30 pm Post subject:
Code: | CFLAGS="-march=athlon-xp -m3dnow -msse -mfpmath=sse -mmmx -O3 -pipe -fforce-addr -fomit-frame-pointer -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -falign-functions=4 -maccumulate-outgoing-args -ffast-math -fprefetch-loop-arrays" |
These are my flags for a Barton 3000+ w/1024MB DDR400, SATA150 80GB disk, KT600/VT8237.
The whole system has been functional since 15/11/2004 without any problem.
Kernels 2.6.9-ac12 and 2.6.10-ck1 have been tested.
hq4ever Apprentice
Joined: 15 Aug 2004 Posts: 167
Twist Guru
Joined: 03 Jan 2003 Posts: 414 Location: San Diego
Posted: Fri Dec 31, 2004 6:31 pm Post subject:
Quote: | I'm sorry for this newb question, but where does the "m" in front of these flags come from?
Shouldn't it be "-3dnow -sse -fpmath=sse -mmx" like here http://gentoo-portage.com/USE ? |
USE flags are specific to Gentoo and indicate a system-level interest (or not) in the application/feature indicated by the flag.
Compile flags are switches to indicate to GCC particular code generation behavior. In this case, -f indicates an "option", whereas -m indicates a "machine option". Most commonly -m is something that is specific to the processor type that is the compile target.
It is correct to use -m to specify fpmath, sse, and mmx switches. All are particular to the processor, not to code generation in general.
-Twist
procyon112 n00b
Joined: 28 Apr 2005 Posts: 16 Location: Seattle, Washington, USA
Posted: Sat Apr 30, 2005 1:34 am Post subject: Invalid test
This test is invalid. Because you are evolving compile flags independently for each test, then accepting the ones that on average give you the best performance, the test is not even as good as:
1) Start with no optimizations and run each program, taking a reading.
2) Turn on an optimization, test, take a reading.
3) Turn on a different optimization and test.
4) Use the optimizations that give benefits; drop the others.
The genetic algorithm is probably worse, because it does not do a comprehensive test, and takes MUCH longer. The GA test is supposed to show which flags work best IN TANDEM, so taking the best average results will probably result in worse performance than -O2 or -O3, which the gcc team has probably already tested for best average performance independently. What you need to do is:
1) Only include in the list of flags to test those which you will have no qualms using in your final system build, e.g., leave out -malign-double.
2) For each generation of the GA, *ALL* benchmarks are run and a rating is given to that "set" of flags as the GA fitness function.
3) Run the GA until you are satisfied with the overall results (since the set of flags is rather small as far as GAs are concerned, 20 generations should be good with a population of 50-100).
4) Use ALL the flags of the winning GA on your system, because what you are testing is not "flag -fomg-fast is beneficial" but rather "flags -fsometimes-good -falmost-never -fduh-use-me-always and -mim-a-typewriter, when used in tandem, beat -O3 on average".
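Point 2 of that list -- rate a whole flag set by one fitness score aggregated over *all* benchmarks -- can be sketched as a tiny fitness function. The benchmark names and timings below are fake placeholders, not real Acovea output; in a real run each line would come from timing one benchmark compiled with the candidate flag set:

```shell
#!/bin/bash
# fitness: read "benchmark seconds" pairs on stdin and print the mean --
# one aggregate score for a whole flag set (lower is better).
fitness() {
    awk '{ sum += $2; n++ } END { printf "%.3f\n", sum / n }'
}

# Fake timings for one candidate flag set:
fitness <<'EOF'
bench1 12.4
bench2 8.9
bench3 10.1
EOF
```

The GA then selects and breeds flag sets by this single score, rather than evolving each benchmark independently.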
Basically, what I am saying is that if you run six independent GAs and then take the average results, your data is completely meaningless and you're better off sticking with the tried and true "-O2 -pipe". Rewrite this GA if you want to get real data out of it.
Hypnos Advocate
Joined: 18 Jul 2002 Posts: 2889 Location: Omnipresent
Posted: Sat Apr 30, 2005 4:16 am Post subject: Re: Invalid test
procyon112 wrote: | This test is invalid. Because you are evolving compile flags independently for each test, then accepting the ones that on average give you the best performance, the test is not even as good as:
1) Start with no optimizations and run each program, taking a reading.
2) Turn on an optimization, test, take a reading.
3) Turn on a different optimization and test.
4) Use the optimizations that give benefits; drop the others. |
Yes, except that you lose information about poor interactions altogether. By picking out the best average flags, you are not just extracting the switches which are beneficial over a variety of algorithms, but also those that "play nice" with others. This varies from machine to machine, it seems.
Quote: | What you need to do is:
1) Only include in the list of flags to test those which you will have no qualms using in your final system build, e.g., leave out -malign-double.
2) For each generation of the GA, *ALL* benchmarks are run and a rating is given to that "set" of flags as the GA fitness function.
3) Run the GA until you are satisfied with the overall results (since the set of flags is rather small as far as GAs are concerned, 20 generations should be good with a population of 50-100).
4) Use ALL the flags of the winning GA on your system, because what you are testing is not "flag -fomg-fast is beneficial" but rather "flags -fsometimes-good -falmost-never -fduh-use-me-always and -mim-a-typewriter, when used in tandem, beat -O3 on average" |
This is not too different from now, except for step 3. The danger here is that you overoptimize to this particular aggregate situation, which is only a rough mapping to the space of all apps you will be compiling. By testing each algorithm separately, you have a larger base of variegated populations whose best traits you can extract statistically.
The bottom line is that I'm testing for "nice" flags; you are trying to find an optimum. In the case that interactions are very important to performance (i.e., strong correlation), as you contend, there's no way that the small Acovea tests can predict the performance of real world apps, so the discussion is moot -- every app would have to be optimized separately anyway. If the optimizing interactions are weak but the interactions that cause breakage are strong (as I contend), then you want to draw "valuable" traits from a broad base of organisms. (*)
This is all borne out by the reports on the old thread (mostly anecdotal): programs aren't any faster, but programs build more reliably and execute with far more stability than the canonical -O2 or -O3.
One good suggestion you make is to diligently weed out flags that you would never use anyway, like "-malign-double", from the set of available flags -- they might cause bad interactions with certain flags that are otherwise valuable.
(*) It should be noted that the intended purpose of Acovea is to test compilers against the different supplied benchmarks, or a specific algorithm against a specific compiler. My scripts generate the inference I describe. _________________ Personal overlay | Simple backup scheme