Gentoo Forums
Gentoo/ GNU-Linux benchmark
jig
n00b
Joined: 15 Feb 2003
Posts: 4

Posted: Sun Oct 19, 2003 10:24 am    Post subject: Gentoo/ GNU-Linux benchmark

Hi all!

I have an idea, but for lack of time I won't be able to implement it myself, so I hope someone else will if it turns out to be a good one.
For a long time I have thought that one important thing is missing from the developer world: an answer to "How much faster will my program run with this gcc flag enabled?"

http://www.gentoo.org/main/en/performance.xml Not so long ago we had a comparison between Gentoo and Mandrake 9.1 which, IMHO, could have given Gentoo even faster results if the author had used other flags.

So now you are asking, "What is the idea anyway?"

It's simple.
Each Gentoo user runs a series of benchmarks (problem 1) and uploads the results, plus the CFLAGS used, to a site.
The results would then be sorted, and soon we would have an enlightening picture for each arch; each user could compare his results with those from similar machines and/or different flags.
In a second (later) step we could benchmark the same programs Jose Lopez did.

(problem 1) My only problem is choosing the right benchmarks... Some new benchmark tools would probably be needed to test specific parameters.
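
To make the "upload results plus CFLAGS, then sort per arch" idea concrete, here is a minimal sketch of what one uploaded record and the ranking step might look like. Everything in it is an assumption for illustration: the `Result` fields, the `rank_by_arch` helper and the "povray-skyvase" label are hypothetical, and the two sample timings are borrowed from the povray runs posted later in this thread.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One uploaded run. All field names are hypothetical."""
    arch: str        # e.g. "athlon-xp"
    cflags: str      # the exact CFLAGS string used for the build
    benchmark: str   # which benchmark was run
    seconds: float   # wall-clock time, lower is better


def rank_by_arch(results):
    """Group results by (arch, benchmark) and sort each group fastest first."""
    groups = defaultdict(list)
    for r in results:
        groups[(r.arch, r.benchmark)].append(r)
    return {key: sorted(rs, key=lambda r: r.seconds) for key, rs in groups.items()}


if __name__ == "__main__":
    uploads = [
        Result("athlon-xp", "-O3 -march=athlon-xp -fomit-frame-pointer",
               "povray-skyvase", 3.156),
        Result("athlon-xp", "-O2 -march=athlon-xp -fomit-frame-pointer",
               "povray-skyvase", 3.002),
    ]
    for (arch, bench), ranked in rank_by_arch(uploads).items():
        print(arch, bench)
        for r in ranked:
            print(f"  {r.seconds:.3f}s  {r.cflags}")
```

Grouping on both arch and benchmark keeps a user from comparing numbers that came from different machines or different workloads, which is the comparison the idea is meant to avoid.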

P.S.: Sorry about the English :-)
HelloWorld82
n00b
Joined: 05 Oct 2003
Posts: 46
Location: Germany

Posted: Sun Oct 19, 2003 10:59 am    Post subject: Fair test ?

I think the performance test there isn't fair:
http://www.gentoo.org/main/en/performance.xml

This is not meant to flame Gentoo; I use it too and I like it a lot. The best way to get a Linux distribution that works right for you is "do it yourself". But I think Mandrake isn't that slow. 9.1 is an older distribution; you should compare Gentoo with Mandrake 9.2. Mozilla was also older in Mandrake 9.1, so it's normal that it starts more slowly. And there is no indication of which flags Mozilla was compiled with. I like Mandrake, because I discovered Linux through it :-)
The_Paya
Developer
Joined: 29 Aug 2003
Posts: 23
Location: Argentina

Posted: Wed Oct 29, 2003 2:40 am

As you asked me on the mailing list: I like your idea, and I had something similar in mind, but I don't think we should use "benchmark" programs for this, because they don't always behave like real-world programs. My idea is to use real-world programs (perhaps configured in some particular way) for this kind of benchmark.

The benchmark I'm posting here was done with povray (http://www.povray.org), following their benchmarking specs (using the benchmark.ini configuration and rendering the skyvase.pov file).

I chose povray because a rendering app like this does a lot of the three big things: memory I/O, integer operations and floating-point operations. It doesn't use the video card to render anything, and this configuration (benchmark.ini) doesn't use any time-consuming output either (just the screen for statistics).

Another thing to keep in mind when choosing a program to benchmark: every program written in plain C/C++ (which is what we want to optimize with our CFLAGS) uses the system libc no matter what, so some of you may ask, "Yeah, but don't you have to recompile the whole glibc again and again between your CFLAGS builds?" That's not so important if you choose the right program for your bench. Povray does *a lot* of operations on variables and calls *a lot* of functions and subfunctions of those functions to render the scene (look at a trace of the DNoise function), so here the compiler has plenty of room for optimization even if your glibc is built for i386 (it will run slower, of course, but you will still notice the differences between each mix of CFLAGS).

One last thing about the program to benchmark: if your program of choice uses a shared library other than libc (povray uses libpng and libtiff, but only for output, and I have no output in either format here) and you still want to benchmark it, it would be good to compile every library the program uses with the same CFLAGS, so the differences between your tests show up much more clearly.
Here it is (sorry, it's very long):

Code:

Commandline: "time nice -n -20 povray skyvase.pov" (using benchmark.ini)

 CFLAGS= -O3 -march=athlon-xp -fomit-frame-pointer
 real    0m3.156s
 user    0m2.996s
 sys     0m0.161s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -fomit-frame-pointer
 real    0m3.002s
 user    0m2.846s
 sys     0m0.157s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -finline-functions -fomit-frame-pointer   <- -O2 plus one of the flags -O3 adds
 real    0m3.197s
 user    0m3.039s
 sys     0m0.158s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -fomit-frame-pointer   <- -O2 plus the other flag -O3 adds. This is the fast one!
 real    0m2.993s
 user    0m2.834s
 sys     0m0.159s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -mpreferred-stack-boundary=2 \ <- slower ?
    -fomit-frame-pointer
 real    0m3.326s
 user    0m3.158s
 sys     0m0.168s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -mpreferred-stack-boundary=4 \ <- RTFM, implied default
    -fomit-frame-pointer
 real    0m2.996s
 user    0m2.834s
 sys     0m0.162s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -mpreferred-stack-boundary=8 \ <- I already RTFM, slower, ok.
    -fomit-frame-pointer
 real    0m3.021s
 user    0m2.860s
 sys     0m0.162s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \    <- I didn't add -mpreferred-stack-boundary because the default is implied
    -fomit-frame-pointer                   <- now with -malign-double: FASTER!
 real    0m2.959s
 user    0m2.802s
 sys     0m0.158s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \    <- almost same as before, new flag implied
    -m96bit-long-double -fomit-frame-pointer      
 real    0m2.982s
 user    0m2.802s
 sys     0m0.181s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \    <- 128bit long double slower.
     -m128bit-long-double -fomit-frame-pointer   
 real    0m3.018s
 user    0m2.858s
 sys     0m0.161s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- almost the same as without -mmmx, implied?
    -mmmx -fomit-frame-pointer
 real    0m2.969s
 user    0m2.802s
 sys     0m0.167s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- again, maybe implied?
     -mmmx -msse -fomit-frame-pointer
 real    0m2.965s
 user    0m2.803s
 sys     0m0.162s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double \   <- no noticeable effect yet,
    -mmmx -msse -m3dnow -fomit-frame-pointer            <- maybe implied?
 real    0m2.962s
 user    0m2.803s
 sys     0m0.159s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- what happens without mmx?
    -msse -m3dnow -fomit-frame-pointer            <- nothing :+/
 real    0m2.964s
 user    0m2.802s
 sys     0m0.162s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- and without sse?
    -m3dnow -fomit-frame-pointer               <- bah, nothing :+/
 real    0m2.974s
 user    0m2.805s
 sys     0m0.169s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- i was reading the info...
    -mno-push-args -fomit-frame-pointer            <- and I found this... not too much, and I don't like it :+P.
 real    0m2.972s
 user    0m2.804s
 sys     0m0.168s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- this implies the last one, well, see what happens..
    -maccumulate-outgoing-args -fomit-frame-pointer         <- faster, but bigger code size. (not a lot of space here)
 real    0m2.969s
 user    0m2.799s
 sys     0m0.170s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- Huh, faster, huh.
    -maccumulate-outgoing-args -mno-align-stringops \
    -fomit-frame-pointer
 real    0m2.948s
 user    0m2.781s
 sys     0m0.168s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- again i'm reading the info...
    -maccumulate-outgoing-args -mno-align-stringops \      <- 17ms slower. bah.
    -minline-all-stringops -fomit-frame-pointer               
 real    0m2.968s
 user    0m2.798s
 sys     0m0.170s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- -fforce-mem in -O2...
    -maccumulate-outgoing-args -mno-align-stringops \      <- what about -fforce-addr?
    -fforce-addr -fomit-frame-pointer            <- mbu. slower.
 real    0m3.132s
 user    0m2.970s
 sys     0m0.162s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- -fbranch-count-reg is enabled with -O2
    -maccumulate-outgoing-args -mno-align-stringops \      <- what happens disabling this?
    -fno-branch-count-reg -fomit-frame-pointer         <- uhm, it's enabled for a good reason (:+P)
 real    0m2.958s
 user    0m2.794s
 sys     0m0.164s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- slow like hell.
    -maccumulate-outgoing-args -mno-align-stringops \
    -fmove-all-movables -freduce-all-givs -fomit-frame-pointer
 real    0m3.198s
 user    0m3.038s
 sys     0m0.160s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- this one generates imprecise math code
    -maccumulate-outgoing-args -mno-align-stringops \      <- but not so imprecise ;+P
    -ffast-math -fomit-frame-pointer
 real    0m3.043s
 user    0m2.881s
 sys     0m0.162s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- let's play with -mfpmath
    -maccumulate-outgoing-args -mno-align-stringops \      <- sse: slower
    -mfpmath=sse -fomit-frame-pointer
 real    0m3.048s
 user    0m2.890s
 sys     0m0.158s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- 387:
    -maccumulate-outgoing-args -mno-align-stringops \      <- mmm, better...
    -mfpmath=387 -fomit-frame-pointer
 real    0m2.952s
 user    0m2.788s
 sys     0m0.164s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- sse,387:
    -maccumulate-outgoing-args -mno-align-stringops \      <- b00, slower...
    -mfpmath=sse,387 -fomit-frame-pointer
 real    0m3.104s
 user    0m2.941s
 sys     0m0.163s
 *********************************************************************
 ************************branch probabilities*****************************
 *********************************************************************
 This is the end of the CFLAGS that gentoo can take. The following works this way:
 You first compile a program with -fprofile-arcs, then run the program for a while.
 When you do this, the program runs slower than hell, but don't worry: it is recording
 branch-flow information alongside your compiled code. (Without this, GCC has to
 guess branch probabilities. With this, the program writes the branch flow to a
 .da file with the same name as the .c/.o file being executed, so
 DON'T delete your directory with the source code.)
 After compiling with -fprofile-arcs and running the program, you recompile it
 with -fbranch-probabilities. The compiler reads the branch data from the .da files
 already generated and lays out the code to favor the most commonly executed,
 and most time-consuming, paths. Just look what happens:
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- now, the real part. profiling.
    -maccumulate-outgoing-args -mno-align-stringops \      <- first we compile with -fprofile-arcs
    -mfpmath=387 -fprofile-arcs -fomit-frame-pointer            <- (compile with -p and use gprof to see nice stats)
 real    0m4.048s
 user    0m3.882s
 sys     0m0.166s
 *********************************************************************
 CFLAGS= -O2 -march=athlon-xp -frename-registers -malign-double  \    <- now, gcc is using the profiled data
    -maccumulate-outgoing-args -mno-align-stringops \      <- what can be faster than this?? :+)
    -mfpmath=387 -fbranch-probabilities -fomit-frame-pointer
 real    0m2.900s
 user    0m2.733s
 sys     0m0.167s
 *********************************************************************
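
Many of the runs in the log above differ by only a few milliseconds, so it helps to turn the `time` lines into numbers and express the differences as percentages. The sketch below does that for the shell-style `0m3.156s` format shown in the log. The helper names `parse_time_line` and `percent_faster` are hypothetical, and the two values used are the `real` figures of the plain -O2 run and the -fbranch-probabilities run.

```python
import re

# Matches lines such as " real    0m3.156s" as printed in the log above.
_TIME_RE = re.compile(r"^\s*(real|user|sys)\s+(\d+)m(\d+(?:\.\d+)?)s\s*$")


def parse_time_line(line):
    """Turn a line like 'real    0m3.156s' into ('real', 3.156)."""
    m = _TIME_RE.match(line)
    if m is None:
        raise ValueError(f"not a time line: {line!r}")
    kind, minutes, seconds = m.groups()
    return kind, int(minutes) * 60 + float(seconds)


def percent_faster(baseline, candidate):
    """How much less time candidate takes than baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0


if __name__ == "__main__":
    _, o2 = parse_time_line(" real    0m3.002s")    # plain -O2 run
    _, pgo = parse_time_line(" real    0m2.900s")   # -fbranch-probabilities run
    print(f"{percent_faster(o2, pgo):.1f}% less wall-clock time")
```

If each figure in the log comes from a single run, gaps of a few milliseconds between neighbouring flag sets may be within run-to-run noise, so the percentages are best read as rough indications.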

_________________
wherever you go, there you are.
fsck!
n00b
Joined: 24 Oct 2003
Posts: 29

Posted: Thu Oct 30, 2003 9:13 pm    Post subject: Try this as a benchmark

Might I recommend the BYTE benchmark? I just saw it yesterday. It seems quite comprehensive (a good range of different tests).

http://forums.gentoo.org/viewtopic.php?t=93250&highlight=unix+benchmark&sid=889a0c7ffaf5fab54152346260601735

It runs very nicely on my Gentoo install. 8)
The_Paya
Developer
Joined: 29 Aug 2003
Posts: 23
Location: Argentina

Posted: Thu Oct 30, 2003 9:41 pm

Hi. I looked at every benchmark app in portage, and, as was already the case before I used Gentoo, I don't like benchmarking programs. They don't reflect "real world" situations (like povray rendering): they run each test in a single function, so the compiler never gets much chance to do anything clever about aligning functions, guessing branch probabilities, etc. That's why I prefer real-world applications for these benchmarks. You could even test with The Gimp; the problem is that there's no way to run it without depending on X output. Another option is "sed" (which is used a lot in a Unix environment), or "grep", or whatever you want: compile it with your test CFLAGS, then "time" it to see how long it takes to process a regular expression.

What I really mean is that benchmarking apps are oriented more towards "what hardware runs me faster" than "which CFLAGS compile me better and make me run faster".
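
As a rough illustration of the "time a real tool" approach, the sketch below runs a command several times and reports the median, so that one noisy run does not decide the comparison. The function names `time_command` and `summarize` are hypothetical, and the workload is a stand-in (a tiny Python one-liner) so the example runs anywhere; in practice it would be replaced by the sed or grep binary built with the CFLAGS under test.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return the wall-clock seconds of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (median, min, max) so that run-to-run noise is visible."""
    return statistics.median(samples), min(samples), max(samples)


if __name__ == "__main__":
    # Stand-in workload; swap in the rebuilt sed/grep command line here.
    argv = [sys.executable, "-c", "sum(range(100000))"]
    median, fastest, slowest = summarize(time_command(argv))
    print(f"median {median:.3f}s (min {fastest:.3f}s, max {slowest:.3f}s)")
```

Reporting min and max next to the median shows how wide the spread is; if two CFLAGS sets differ by less than that spread, the measurement cannot really separate them.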

Salu2.
The_Paya
Developer
Joined: 29 Aug 2003
Posts: 23
Location: Argentina

Posted: Fri Nov 26, 2004 4:43 am    Post subject: mbump

First, I'm trying to revive the "benchmark these CFLAGS in a real app with nice -n -20" idea I wrote about in this thread; second, I want to show some success from my work while making sure I'm "not doing the same thing that someone else maybe already did". So I'm posting this "experience" and a little question.

I have worked with Linux in various places over time, but all of those places had policies about which Linux distribution they would use (or not use at all), so the only thing I could do with Gentoo was "I use it only on my machine". Now, though, I work in a complete mess of an extremely badly run "governmental institution", one of those that needs -INTENSIVE- processing power while looking for the minimal-lowest-nonexistent "cost".

Sounds funny, right? :+)

Well, here is where Gentoo makes my life easier:
The "guy before me" was a fan of OpenBSD, which I respect and still use for our "internet" firewalls, but it really lacks the support that everyone else at work needed. A little example: they were working with databases in "a couple" of RDBMSs: Gupta/Centura SQLBase, Oracle and Interbase, among others. So... "how the hell can I make an apache+mysql+php THING work with these databases on OBSD?" Just "no way". So before starting on the real "web services", they asked me to prove myself by "migrating" some "non-cost-effective" M$ services, which I did:

The hardware they have is some Fujitsu-Siemens Primergy P470 servers with dual Pentium II or Pentium III CPUs, no faster than 500MHz for the P3 and 450MHz for the P2. If I wanted 1GB of RAM they had it, but it wasn't necessary. The only thing I really liked about these machines was the hot-pluggable backplanes with Mylex DAC960 and DAC1100 SCSI-2 RAID controllers.

So I took my livecd and migrated two M$ systems and an old Redhat. First was the "internet S and A" server (ISA :+P), which became squid->winbind/samba->w2kdomaingroups. Then came the mail server, a qmail installed by some strange script that downloads, compiles, installs and configures everything by itself without asking a sh*t, and which was receiving more than 3000 viruses per week; now it's a postfix+mysql+amavis+clamav+dspam solution I built myself (following TONS of howtos, of course :+P). The last one was the web server itself, an "it needs to be restarted every 4 hours" win2k box with apache and php. Now it can access ANY database that has an ODBC driver, even on WINDOWS, from php under apache (using a DBTCP proxy on a win2k machine; google it, it's cool. I also patched the php module so it compiles and added a pear module to access it from Pear DB.php ;+), and it accesses oracle 8.1.7 and oracle 9.2.0.5.

So far I haven't needed "great" optimization. CFLAGS like "-O2 -march=pentium3/2 -frename-registers -fomit-frame-pointer -pipe" did the magic very well.

But now "the time has come" X'D

Now I have to build -THE- database server.

I have already started on it, but with -extra-safe-cflags-and-package-versions-, because it has to run Oracle 9.2.0.5, and you can find a lot more "troubles" than "solutions" when running oracle on non-certified (and on certified too, btw) distributions.

Now, after 3 weeks of recovering dead disks, going for the S40 storage, and compiling/downloading/installing everything down to the tiniest detail....
it's working...

This is what I've got:

The computer has two Pentium III 500MHz CPUs with 1GB of ECC RAM and a total of 14 SCSI-2 hard disks (yes, 14). Two of them (9GB each) are in a RAID1 (mirror) configuration for the boot system (gentoo ;+) with reiserfs, on a DAC960PRL controller.
The other 12 are configured this way: since the DAC1100 that connects the server to the storage cannot handle more than 8 disks in a single array, the storage has two RAID5 arrays, one made of eight 9GB disks and the other of four 18GB disks. I combined these into three different "partitions" using software RAID0 (striping), with a total of (around) 110GB, divided into 60GB/30GB/20GB (oracle users may guess this is u01/u02/u03 :+). The storage holds the big swap partition (2GB, outside the softraid) and, of course, will hold the database. Since this is going to handle "big" files, it uses XFS tuned according to Daniel Robbins' suggestions ;+).
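
The disk arithmetic behind that layout can be checked with a few lines. This is only a sketch of the nominal capacities, assuming textbook RAID1 and RAID5 (one disk's worth of parity per RAID5 array); the helper names are hypothetical, and controller or filesystem overhead is ignored.

```python
def raid5_usable(disks, size_gb):
    """RAID5 keeps one disk's worth of parity, so usable space is (n - 1) disks."""
    if disks < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (disks - 1) * size_gb


def raid1_usable(disks, size_gb):
    """A mirror stores the same data on every disk, so capacity is one disk."""
    if disks < 2:
        raise ValueError("RAID1 needs at least 2 disks")
    return size_gb


if __name__ == "__main__":
    boot = raid1_usable(2, 9)       # the two 9GB system disks
    array_a = raid5_usable(8, 9)    # eight 9GB disks
    array_b = raid5_usable(4, 18)   # four 18GB disks
    print(boot, array_a, array_b, array_a + array_b)   # 9 63 54 117
```

The two RAID5 arrays come to a nominal 117GB, a little above the "around 110GB" quoted; the post does not say where the difference goes.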

Now, regarding the original subject: I'm thinking about doing an emerge -e world with new CFLAGS, checking whether it breaks anything and whether it works faster, and seeing if I can extract all the juice these machines have. So I'm about to start "benchmarking".

But, bearing in mind the -fact- that this won't recompile the oracle stuff (oracle doesn't compile anything on your machine, it just "links" stuff), I was thinking about "forcing" some CFLAGS on glibc itself and on other libs oracle may use that can be optimized "the gentoo way".

So, before I start this "adventure" X'D, I'm asking whether anyone has done something similar before. Results, advice, CFLAGS!!!, even ~x86 stuff like gcc3.4 if it doesn't break oracle ;+), kernel schedulers (staircase?, cfq?, and io: anticipatory?, deadline?), kernel patches, anything (ideas, insults, flames X'D) can be of help and will be appreciated.

(I should add that this is a sort of "competition" between me and the DBA, who says "the server must be Redhat 8, since oracle has certified it", while I say "the server must be tested with gentoo first, since it will run faster and we need speed on the old machines we have"....
So in the end this will be a benchmark with a title like "Indestructible Redhat-Oracle" vs "Fast-Bleeding-Edge Gentoo-Oracle".)

So I hope the enthusiasm of all the gentoo users can help me win this battle ;+)

Thanks a lot for reading all of this. I know it's very long, but I thought it would be nice to let people know that gentoo is growing at scales like this :+)

Salu2,
Javier.
nxsty
Veteran
Joined: 23 Jun 2004
Posts: 1556
Location: .se

Posted: Sun Nov 28, 2004 2:30 pm    Post subject: Re: mbump

The_Paya wrote:
So, before I start this "adventure" X'D, I'm asking whether anyone has done something similar before. Results, advice, CFLAGS!!!, even ~x86 stuff like gcc3.4 if it doesn't break oracle ;+), kernel schedulers (staircase?, cfq?, and io: anticipatory?, deadline?), kernel patches, anything (ideas, insults, flames X'D) can be of help and will be appreciated.


Staircase is a CPU scheduler and the others are I/O schedulers, so they are completely different things. I think you should try staircase; it's the best-performing CPU scheduler available today and it's faster than the standard O(1) scheduler in almost any situation.

You can read about the I/O schedulers here:
/usr/src/linux/Documentation/block
but I think deadline is probably the one you want, because it's supposed to be good for database loads.
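
On kernels that expose it, the active I/O scheduler for a disk can be read from /sys/block/<device>/queue/scheduler, where the scheduler in use is shown in square brackets. The sketch below parses that format; the function names are hypothetical, "sda" is only an example device name, and whether the file exists at all depends on the kernel version.

```python
from pathlib import Path


def parse_scheduler_line(text):
    """Split 'noop anticipatory [deadline] cfq' into (active, available)."""
    available, active = [], None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]   # strip the brackets marking the active one
            active = token
        available.append(token)
    return active, available


def read_scheduler(device):
    """Read the scheduler file for a block device such as 'sda' (example name)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())


if __name__ == "__main__":
    print(parse_scheduler_line("noop anticipatory [deadline] cfq"))
    # -> ('deadline', ['noop', 'anticipatory', 'deadline', 'cfq'])
```

Checking this after boot is a quick way to confirm that the scheduler actually in use is the one that was meant to be benchmarked.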

And use NPTL instead of LinuxThreads. NPTL is much faster!
The_Paya
Developer
Joined: 29 Aug 2003
Posts: 23
Location: Argentina

Posted: Fri Dec 10, 2004 3:28 am

First of all, thanks for the reply. The server is now running with the staircase CPU scheduler and the deadline I/O scheduler. I don't think oracle is going to support NPTL; has anyone tested this?
asimon
l33t
Joined: 27 Jun 2002
Posts: 979
Location: Germany, Old Europe

Posted: Fri Dec 10, 2004 10:30 am

The_Paya wrote:
I don't think oracle is going to support NPTL; has anyone tested this?


Not tested, but a Google search indicates that 10g does support NPTL and works without exporting LD_ASSUME_KERNEL=2.4.1.
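
For context, LD_ASSUME_KERNEL is an environment variable read by the glibc dynamic loader; on glibc builds of that era that shipped both thread libraries, setting it to 2.4.1 made a program fall back to LinuxThreads instead of NPTL. A minimal sketch of setting it for one child process only, rather than exporting it shell-wide, could look like this. The helper name `linuxthreads_env` is hypothetical, and current glibc versions may not honor the variable at all.

```python
import os


def linuxthreads_env(base=None, kernel="2.4.1"):
    """Copy an environment and set LD_ASSUME_KERNEL in the copy only."""
    env = dict(os.environ if base is None else base)
    env["LD_ASSUME_KERNEL"] = kernel
    return env


if __name__ == "__main__":
    env = linuxthreads_env({"PATH": "/usr/bin"})
    print(env["LD_ASSUME_KERNEL"])   # 2.4.1
    # A real launch would pass this dict as env= to subprocess.run(...).
    # The command that starts the database is not given in the thread,
    # so none is shown here.
```

Keeping the variable out of the global environment means only the process that needs the old threading model sees it.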
The_Paya
Developer
Joined: 29 Aug 2003
Posts: 23
Location: Argentina

Posted: Mon Dec 13, 2004 10:53 am

We're currently using Oracle 9.2i, because of compatibility issues with newer versions :+/
All times are GMT