| Author |
Message |
d99ma Tux's lil' helper

Joined: 21 Jul 2003 Posts: 148 Location: Lund, Sweden
|
Posted: Thu Sep 07, 2006 9:55 pm Post subject: Core 2 Duo - Merom |
|
|
Hello,
I've just placed an order for a new laptop that comes with a Core 2 Duo (Merom) CPU. From the threads I've found, people recommend either -march=nocona or -march=pentium-m with -msse3 for Conroe, the desktop version. Which one is the best choice for a laptop Merom CPU?
Thanks
/Martin |
|
Lloeki Guru


Joined: 14 Jun 2006 Posts: 437 Location: France
|
Posted: Fri Sep 08, 2006 8:38 am Post subject: |
|
|
depends on whether architecturally merom is closer to nocona (desktop) or to banias/dothan (pentium m).
I bet the latter, as pentium m were closer to pIII than to their pIV desktop counterparts. _________________ Moved to using Arch Linux
Life is meant to be lived, not given up...
HOLY COW I'M TOTALLY GOING SO FAST OH F***  |
|
alechiko Guru


Joined: 01 Feb 2004 Posts: 461 Location: Inside piano, do not disturb.
|
Posted: Sun Sep 10, 2006 3:02 am Post subject: |
|
|
| Lloeki wrote: | depends on whether architecturally merom is closer to nocona (desktop) or to banias/dothan (pentium m).
I bet the latter, as pentium m were closer to pIII than to their pIV desktop counterparts. |
Um... Ja but isn't the merom 64bit? Surely P-M/P-3 flags are going to ignore that 64bit goodness. _________________ alechiko |
|
Lloeki Guru


Joined: 14 Jun 2006 Posts: 437 Location: France
|
Posted: Sun Sep 10, 2006 8:07 am Post subject: |
|
|
maybe add -m64 ? really, I don't know, I'm not that proficient in 64bit. _________________ Moved to using Arch Linux
Life is meant to be lived, not given up...
HOLY COW I'M TOTALLY GOING SO FAST OH F***  |
|
d99ma Tux's lil' helper

Joined: 21 Jul 2003 Posts: 148 Location: Lund, Sweden
|
Posted: Mon Sep 11, 2006 12:16 pm Post subject: |
|
|
64 bits is another question.
Is anyone running 64-bit Linux on the Merom?
Is the (small?) performance gain worth the potential problems?
/Martin |
|
dirtyepic Developer


Joined: 22 Oct 2004 Posts: 1614 Location: sk.ca
|
Posted: Tue Sep 12, 2006 5:49 am Post subject: |
|
|
binutils newer than 2.16.91.0.7 actually support -march/tune=merom. However, the required GCC bits [i], which also add -mmni (Merom New Instructions) haven't gone in yet, and won't until 4.3 opens for development [ii].
in the meantime, the IA-32 Intel Architecture Optimization Reference Manual definitely seems to put Core and Core 2 into the Pentium-M family. for 64 bit, i'm not sure if -march=pentium-m -msse3 -m64 will do the job or not. i do know that the amd64 profile will spit a big freakin annoying red warning and pause for 5 seconds every time you try emerge something with -m64 in your CFLAGS [iii]. it's possible that the amd64 profile automatically adds -m64 for you, but you'll have to check for yourself since my em64t box is booted into x86 at the moment.
[i] http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01866.html
[ii] http://gcc.gnu.org/ml/gcc-patches/2006-07/msg00605.html
[iii] a properly motivated person might create an executable script in /etc/portage/postsync.d containing "rm /usr/portage/profiles/default-linux/amd64/profile.bash" (or even a sed line removing a certain flag from $BAD_FLAGS if they still wanted the "filter invalid or nonexistent flags" functionality). this of course would be highly unsupported.  _________________ by design, by neglect
for a fact or just for effect |
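One rough way to check whether the toolchain already targets 64-bit without adding -m64 yourself is to ask the compiler for its target triplet. This is only a sketch: it assumes gcc is on the PATH, `bits_from_triplet` is a made-up helper name, and it only recognises the usual x86 triplets.

```shell
# Sketch: map a GCC target triplet to its usual default word size.
# bits_from_triplet is a hypothetical helper, not part of GCC or Portage.
bits_from_triplet() {
    case "$1" in
        x86_64-*) echo 64 ;;       # amd64 / EM64T toolchains
        i?86-*)   echo 32 ;;       # i386 .. i686 toolchains
        *)        echo unknown ;;  # anything this sketch does not know
    esac
}

# Ask the installed compiler, if there is one.
if command -v gcc >/dev/null 2>&1; then
    echo "gcc defaults to $(bits_from_triplet "$(gcc -dumpmachine)")-bit code"
fi
```

If this reports 64, the -m64 in CFLAGS is most likely unnecessary, though a multilib setup can still switch modes per package, so treat the answer as a hint.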
|
d99ma Tux's lil' helper

Joined: 21 Jul 2003 Posts: 148 Location: Lund, Sweden
|
Posted: Wed Sep 13, 2006 3:31 pm Post subject: |
|
|
dirtyepic, thanks for the insight!
I will probably stick to 32bits and wait for proper merom support from gcc before switching to 64 bits. |
|
darkphader Veteran


Joined: 09 May 2002 Posts: 1054 Location: Motown
|
Posted: Wed Sep 13, 2006 4:00 pm Post subject: |
|
|
| dirtyepic wrote: | | in the meantime, the IA-32 Intel Architecture Optimization Reference Manual definitely seems to put Core and Core 2 into the Pentium-M family. for 64 bit, i'm not sure if -march=pentium-m -msse3 -m64 will do the job or not. i do know that the amd64 profile will spit a big freakin annoying red warning and pause for 5 seconds every time you try emerge something with -m64 in your CFLAGS [iii]. it's possible that the amd64 profile automatically adds -m64 for you, but you'll have to check for yourself since my em64t box is booted into x86 at the moment. |
Other posts I've read, plus the installation handbook state that -march=nocona is proper for EM64T users.
Sorry, don't really know if/why that info is valid for the merom.
Chris _________________ To our sweethearts and wives. May they never meet. |
|
dirtyepic Developer


Joined: 22 Oct 2004 Posts: 1614 Location: sk.ca
|
Posted: Sun Sep 17, 2006 2:12 am Post subject: |
|
|
to add one more thing to the confusion, i just stumbled over this:
| Quote: | > So, this person has a pentium m, and /proc/cpuinfo says his processor
> belongs to family 6... as you can see, mine also belongs to family 6...
> so even though the wiki states march=prescott, what do you guys think?
>
-march=pentium-m prefers x87 over sse scalar code, because pentium-m can
decode sse at only half the rate of x87. You should see the speed
advantage clearly on pentium-m, presumably not on Core Duo. |
http://article.gmane.org/gmane.comp.gcc.help/15506
And SSE is definitely the win on Core chips:
| Quote: | Core Duo Processors
On Intel Core Solo and Intel Core Duo processors, the combination of
improved decoding and micro-op fusion allows instructions which were
formerly two, three, and four micro-ops to go through all decoders. As a
result, scalar SSE/SSE2 code can match the performance of x87 code
executing through two floating-point units. On Pentium M processors,
scalar SSE/SSE2 code can experience approximately 30% performance
degradation relative to x87 code executing through two floating-point
units.
In code sequences that have conversions from floating-point to integer,
divide single-precision instructions, or any precision change; x87 code
generation from a compiler typically writes data to memory in
single-precision and reads it again in order to reduce precision. Using
SSE/SSE2 scalar code instead of x87 code can generate a large
performance benefit using Intel NetBurst microarchitecture and a
modest benefit on Intel Core Solo and Intel Core Duo processors.
Recommendation: Use the compiler switch to generate SSE2 scalar
floating-point code over x87 code. |
http://www.intel.com/design/pentium4/manuals/index_new.htm
i think the switch mentioned would be -mfpmath=sse _________________ by design, by neglect
for a fact or just for effect |
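Before turning on -mfpmath=sse it is worth confirming that the CPU actually reports SSE2. A minimal sketch, assuming a Linux-style /proc/cpuinfo; `cpu_has_flag` is a hypothetical helper name. Note that /proc/cpuinfo lists SSE3 as pni and 64-bit long mode as lm.

```shell
# cpu_has_flag FLAG: hypothetical helper. Reads /proc/cpuinfo-style text on
# stdin and succeeds if FLAG appears as a whole word on a "flags" line.
cpu_has_flag() {
    awk -v want="$1" '
        /^flags/ { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END      { exit !found }
    '
}

# Report the flags this thread cares about, when /proc/cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
    for f in sse2 pni lm; do
        if cpu_has_flag "$f" < /proc/cpuinfo; then
            echo "$f: yes"
        else
            echo "$f: no"
        fi
    done
fi
```

The match is on whole words, so asking for sse does not accidentally match sse2.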
|
irondog l33t


Joined: 07 Jul 2003 Posts: 715 Location: Voor mijn TV. Achter mijn pc.
|
Posted: Sun Sep 17, 2006 2:23 pm Post subject: |
|
|
Dirtyepic, should Core 2 Duo users use -mtune=merom once it becomes available? Or is "nocona" more appropriate in some situations? _________________ Alle dingen moeten onzin zijn. |
|
ECantona n00b

Joined: 26 Apr 2005 Posts: 65
|
Posted: Wed Sep 20, 2006 6:37 pm Post subject: |
|
|
In the end, which CFLAGS can we safely use with a Merom processor? I'm a bit confused.
and what about -march prescott? |
|
ECantona n00b

Joined: 26 Apr 2005 Posts: 65
|
Posted: Wed Sep 20, 2006 9:18 pm Post subject: |
|
|
| and also, which processor family should we choose in kernel configuration for a merom processor? |
|
Lloeki Guru


Joined: 14 Jun 2006 Posts: 437 Location: France
|
Posted: Thu Sep 21, 2006 7:31 am Post subject: |
|
|
| Quote: | Or is "nocona" more appropriate in some situations?
and what about -march prescott? |
damn, how many times does this have to be said?
nocona and prescott are architecturally totally different to merom.
this is like using pentium4 for pentium-m, you will lose performance. things forked at pentium3. so, when pentium-m didn't exist, pentium3 was the flag to use. the closest thing to merom is core duo (which is 2 pentium-m cores).
you need sse3? add -msse3.
you really want 64bit? add -m64 to cflags. gcc manual:
| Quote: |
-m32
-m64
Generate code for a 32-bit or 64-bit environment. The 32-bit environment sets int, long and pointer to 32 bits and generates code
that runs on any i386 system. The 64-bit environment sets int to 32 bits and long and pointer to 64 bits and generates code for
AMD's x86-64 architecture. |
don't be fooled by 'AMD', this is really the flag to use. EM64T and AMD64 are (mostly) compatible, and different than IA64 (Xeon).
I'll receive my merom in 2 to 5 days, and anyway, I could care less about 64 bits. screw up videogames console marketing, 64bit anything but like 2*32bit: 64 bit is a tad slower, takes a tad to a whole more space (e.g the L2 cache will be twice as filled in 64bit mode than in 32bit, so a 64bit 4MB cache is effectively like a 32bit 2MB cache), and I don't need the extra precision (which is in the end, un-precision, as x87 computes on 80bit, while 64bit instructions (like sse) compute on... 64bit) and pointing ability (I have 'only' 1Gb ram). and I'll save myself some chroot/emul-linux/32v64 binary headaches (yes, I do use closed source, and I need them).
see ya in 2038. till then, 64 bit is just 'try, and adapt before convert'. _________________ Moved to using Arch Linux
Life is meant to be lived, not given up...
HOLY COW I'M TOTALLY GOING SO FAST OH F***  |
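The type widths from the gcc manual text quoted above can be restated as a tiny lookup, which makes the "what actually grows in 64-bit mode" point concrete: int stays at 32 bits, only long and pointers double. `type_sizes` is a hypothetical helper that just encodes the manual's wording; the getconf call assumes a libc that provides LONG_BIT.

```shell
# type_sizes MODE: hypothetical helper restating the gcc manual text.
# Prints the widths in bits of "int long pointer" for -m32 or -m64.
type_sizes() {
    case "$1" in
        -m32) echo "32 32 32" ;;
        -m64) echo "32 64 64" ;;
        *)    echo "usage: type_sizes -m32|-m64" >&2; return 1 ;;
    esac
}

type_sizes -m32
type_sizes -m64

# Width of long in the running userland, if getconf knows about it.
getconf LONG_BIT 2>/dev/null || true
```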
|
Mad Merlin Veteran

Joined: 09 May 2005 Posts: 1066
|
Posted: Thu Sep 21, 2006 2:17 pm Post subject: |
|
|
| Lloeki wrote: |
| Quote: |
-m32
-m64
Generate code for a 32-bit or 64-bit environment. The 32-bit environment sets int, long and pointer to 32 bits and generates code
that runs on any i386 system. The 64-bit environment sets int to 32 bits and long and pointer to 64 bits and generates code for
AMD's x86-64 architecture. |
don't be fooled by 'AMD', this is really the flag to use. EM64T and AMD64 are (mostly) compatible, and different than IA64 (Xeon).
|
Actually, IA64 [1] is Itanium (2), not Xeon [2]. Xeon:Pentium::Opteron:Athlon.
[1] http://en.wikipedia.org/wiki/IA-64
[2] http://en.wikipedia.org/wiki/Xeon _________________ Game! - Where the stick is mightier than the sword! |
|
Lloeki Guru


Joined: 14 Jun 2006 Posts: 437 Location: France
|
Posted: Thu Sep 21, 2006 8:39 pm Post subject: |
|
|
Mad Merlin, thanks for the correction  _________________ Moved to using Arch Linux
Life is meant to be lived, not given up...
HOLY COW I'M TOTALLY GOING SO FAST OH F***  |
|
lcj n00b


Joined: 25 Apr 2004 Posts: 74 Location: Opole, Poland
|
Posted: Fri Sep 22, 2006 1:03 pm Post subject: |
|
|
I'm using the following flags on a Core 2 Duo T7200: | Code: | CFLAGS="-O2 -march=nocona -mtune=nocona -msse3 -mfpmath=sse -pipe -fomit-frame-pointer" | I moved from pentium-m a day ago, and I'm currently running a recompiled XGL and firefox with no problems. But the 2.6.18 kernel fails randomly during boot when disk access is heavy; I need to research that... Not sure if it's related to the flags. _________________ --
Lukasz C. Jokiel via web |
|
Lloeki Guru


Joined: 14 Jun 2006 Posts: 437 Location: France
|
Posted: Fri Sep 22, 2006 2:57 pm Post subject: |
|
|
of course you won't encounter any obvious problem (crash, 'illegal instruction', etc...). what you may encounter is performance problems.
pipeline length, l1 cache handling, design philosophy, etc... see here (and links) why merom is closer to a pentium-m than to a nocona. and of course, you won't get 100% out of your merom with -march=pentium-m, but where you'd get 80% with p-m, you will only get 40% with nocona (dummy figures, but hey, expect something in that range with a "pipeline [...] less than half of Prescott's"). you'd better use p3 altogether.
anyway, what we're talking about is microsecond improvement, so you won't see much difference in the end, and as you will eventually rebuild everything once march=merom is out, you should wait altogether.
as a side note, the explicit -mtune=nocona is redundant: per the gcc manual, -march=cpu-type already implies -mtune=cpu-type. repeating it is harmless, since -mtune only changes scheduling and never takes away the instructions -march enables. _________________ Moved to using Arch Linux
Life is meant to be lived, not given up...
HOLY COW I'M TOTALLY GOING SO FAST OH F***  |
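On the -mtune point: the gcc manual says -march=cpu-type implies -mtune=cpu-type, so a matching -mtune can simply be dropped. A small sketch of that cleanup follows; `drop_redundant_mtune` is a hypothetical helper, and it assumes the flags are plain space-separated words with no quoting or glob characters.

```shell
# drop_redundant_mtune "FLAGS": hypothetical helper. Removes -mtune=X from a
# flag string when the same string already contains -march=X.
drop_redundant_mtune() (
    out=
    for flag in $1; do
        case "$flag" in
            -mtune=*)
                # Skip -mtune only when -march names the same CPU.
                case " $1 " in
                    *" -march=${flag#-mtune=} "*) continue ;;
                esac
                ;;
        esac
        out="${out:+$out }$flag"
    done
    printf '%s\n' "$out"
)

drop_redundant_mtune "-O2 -march=nocona -mtune=nocona -msse3 -mfpmath=sse -pipe -fomit-frame-pointer"
```

A -mtune that names a different CPU than -march is kept, since that combination is a deliberate choice (build for one CPU, schedule for another).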
|
lcj n00b


Joined: 25 Apr 2004 Posts: 74 Location: Opole, Poland
|
Posted: Fri Sep 22, 2006 6:36 pm Post subject: |
|
|
So to sum-up I'd need to run some benchmarks to make sure that pentium-m is better for the time being than nocona... _________________ --
Lukasz C. Jokiel via web |
|
Lloeki Guru


Joined: 14 Jun 2006 Posts: 437 Location: France
|
Posted: Sat Sep 23, 2006 9:24 am Post subject: |
|
|
no, pentium-m (predecessor of Core arch) has 98% chances of being faster than nocona (netburst arch).
from link:
| Quote: | | Intel has replaced NetBurst with the Intel Core microarchitecture, released in July 2006, which is more directly derived from 1995's Pentium Pro than it is from NetBurst. |
_________________ Moved to using Arch Linux
Life is meant to be lived, not given up...
HOLY COW I'M TOTALLY GOING SO FAST OH F***  |
|
lcj n00b


Joined: 25 Apr 2004 Posts: 74 Location: Opole, Poland
|
Posted: Sat Sep 23, 2006 10:42 am Post subject: |
|
|
Hmmm... I compiled gimp with gcc (4.1.1, unstable Gentoo), first tuned for nocona and then with only the pentium-m flags, and frankly, saving a 4096x4096x24 PNG file looks rather faster with nocona than with pentium-m on the Core 2 Duo T7200. Judging from the discussion here I expected the opposite. _________________ --
Lukasz C. Jokiel via web |
|
Lloeki Guru


Joined: 14 Jun 2006 Posts: 437 Location: France
|
Posted: Sat Sep 23, 2006 5:44 pm Post subject: |
|
|
I fail to see how writing a file to disk can be a benchmark of cpu performance.
plus in this case it certainly relies on at least gtk and glibc, and maybe some libs for png conversion, so these should have to be rebuilt too.
benchmarking such things are really not easy. at all. _________________ Moved to using Arch Linux
Life is meant to be lived, not given up...
HOLY COW I'M TOTALLY GOING SO FAST OH F***  |
|
lcj n00b


Joined: 25 Apr 2004 Posts: 74 Location: Opole, Poland
|
Posted: Sat Sep 23, 2006 5:55 pm Post subject: |
|
|
Well, given that the file is buffered completely (no actual disk access), it's just pure CPU power being used to compress the bitmap. Sure, benchmarking is not easy, but since the difference is noticeable I need to check kernel compilation times. Anybody else doing such experiments? _________________ --
Lukasz C. Jokiel via web |
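For anyone wanting to repeat this kind of comparison, running the same workload several times per flag set and comparing the best run reduces noise from caching and background load. A rough sketch, with `time_runs` and `best_of` as hypothetical helper names; it only has whole-second resolution (via date +%s), so it suits long jobs such as a kernel build, not millisecond-level differences.

```shell
# time_runs N CMD...: hypothetical helper. Runs CMD N times and prints the
# elapsed wall-clock time of each run in whole seconds, one per line.
time_runs() (
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1 || true   # ignore the command's exit status
        end=$(date +%s)
        echo $((end - start))
        i=$((i + 1))
    done
)

# best_of: read one number per line on stdin and print the smallest.
best_of() {
    sort -n | head -n 1
}

# Example with a trivial command; replace "true" with the real workload.
time_runs 3 true | best_of
```

The libraries the workload depends on would need to be rebuilt with each flag set as well, otherwise the comparison mostly measures the unchanged parts.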
|
dirtyepic Developer


Joined: 22 Oct 2004 Posts: 1614 Location: sk.ca
|
Posted: Sun Sep 24, 2006 3:11 am Post subject: |
|
|
| Lloeki wrote: | | this is like using pentium4 for pentium-m, you will lose performance. things forked at pentium3. so, when pentium-m didn't exist, pentium3 was the flag to use. the closest thing to merom is core duo (which is 2 pentium-m cores). |
first, please post some numbers to back up your statements. second, you're missing the big picture. it's NOT a pentium-m microarch. they didn't "fork" anything. it's similar in design philosophy, and shares a lot in common with that CPU. but there are major differences. see above for just a few examples. i personally don't know one way or another. i've asked on the gcc mailing list but haven't received a reply. i use -march=prescott, others can use whatever they want, but i refuse to recommend anything without seeing the numbers first.
| Quote: | | you really want 64bit? add -m64 to cflags. |
no, you can't do that without running a 64bit multilib portage profile. you need certain libraries to be 32bit and others to be 64bit. forcing -m64 will break things. it may be possible to run an amd64 profile with -march=pentium-m, but i really don't play with the amd64 toolchain enough to know.
| Quote: | | I'll receive my merom in 2 to 5 days, and anyway, I could care less about 64 bits. screw up videogames console marketing, 64bit anything but like 2*32bit: 64 bit is a tad slower, takes a tad to a whole more space (e.g the L2 cache will be twice as filled in 64bit mode than in 32bit, so a 64bit 4MB cache is effectively like a 32bit 2MB cache), and I don't need the extra precision (which is in the end, un-precision, as x87 computes on 80bit, while 64bit instructions (like sse) compute on... 64bit) and pointing ability (I have 'only' 1Gb ram). |
huh? _________________ by design, by neglect
for a fact or just for effect |
|
Lloeki Guru


Joined: 14 Jun 2006 Posts: 437 Location: France
|
Posted: Sun Sep 24, 2006 10:24 am Post subject: |
|
|
first things first, I never intended to provide the Absolute Truth About Everything. I gather elements which I find relevant and expose them for discussion, and readily accept any correction
if by numbers you mean benchmarks, I can't provide benchmarks because:
| Quote: | | benchmarking such things are really not easy. at all. |
and
| Quote: | | I'll receive my merom in 2 to 5 days |
as numbers, the sole numbers I have are in the links provided:
| Quote: | | Core's execution unit is 4-issues wide, compared to the 3-issue cores of P6, P6-M (Banias, Dothan, and Yonah), and NetBurst microarchitectures |
so p-m and prescott are even here, and optimized code for this won't be generated until merom arrives. I'm uncertain if gcc optimizes code for such a feature.
| Quote: | | The pipeline is 14 stages long | vs | Quote: | | The Prescott architecture, the last core of the Pentium 4, has a 31 stage pipeline |
optimizing code for a 31-stage pipeline and feeding it to a 14-stage pipeline is certainly insane throughput-wise. pipeline techniques like branch prediction will certainly be affected. if I'm not mistaken, gcc does that kind of code optimization.
| Quote: | | The Prescott was produced [...] addition of an even larger cache (from 512KB in the Northwood to 1MB, and later 2MB) |
merom has 4mb, so there will be a net loss here. if I'm not mistaken, gcc does that kind of code optimization too.
| Quote: | | no, you can't do that without running a 64bit multilib portage profile. |
of course you can't. the point was to show how to 'manually' generate EM64T instructions without -march.
| Quote: | | but there are major differences |
you're right, the biggest one being the presence of two cores, and linked-l1/shared-l2 cache handling, which by itself justifies a new march.
64bit interest is in:
- computing twice the precision at same speed
- handling and addressing long long directly
this has the advantage of:
- handling >4Gb ram efficiently
- handling >1Gb ram very efficiently
- handling big files
- number-crunching apps, where higher precision will come at no performance cost
- save us from a 2038 blackout
64bit drawback is:
- it takes twice as much space as 32bit
which has implications (not necessarily of the 2* order) on generated code size, l1/l2 cache usage, ram usage, and so on. lots of (more or less arguable) benchmarks are available. I read a very accurate one in that amss but I can't find it anymore.
so what I meant is, for now, I'll play around with 64bit, but I'll install and run a 32bit gentoo.
again, that's what I gathered from the net, mixed with personal knowledge, and concluded. I readily accept any constructive criticism, and I'm always happy to learn more  _________________ Moved to using Arch Linux
Life is meant to be lived, not given up...
HOLY COW I'M TOTALLY GOING SO FAST OH F***  |
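The 2038 remark in the list above comes down to simple arithmetic, assuming time is kept as a signed 32-bit count of seconds since 1970. A minimal sketch; `max_time32` and `overflow_year` are made-up names, the year length is an average (365.2425 days), and the shell is assumed to do arithmetic in more than 32 bits.

```shell
# max_time32: largest value of a signed 32-bit seconds counter.
max_time32() {
    echo $(( (1 << 31) - 1 ))
}

# overflow_year: approximate calendar year in which that counter runs out,
# using an average year of 31556952 seconds.
overflow_year() {
    echo $(( 1970 + $(max_time32) / 31556952 ))
}

max_time32
overflow_year
```

A 64-bit long pushes that limit out of any practical range, which is the one item on the list that does not depend on benchmarks.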
|
xentric Guru


Joined: 16 Mar 2003 Posts: 410 Location: Netherlands
|
Posted: Sun Sep 24, 2006 12:15 pm Post subject: |
|
|
I have the E6300 Core2 Duo (Allendale) in my system.
What's best to be used as "Processor Family" when configuring my kernel, Pentium-M or Pentium-4?
And does this processor support "CPU frequency scaling" with Intel Enhanced Speedstep or Intel Pentium-4 clock modulation? _________________ When all else fails, read the manual...
Registered Linux User #340626
Last edited by xentric on Sat Sep 30, 2006 12:09 am; edited 1 time in total |
|