d2_racing Moderator


Joined: 25 Apr 2005 Posts: 12867 Location: Ste-Foy,Canada
|
Posted: Sun Mar 22, 2009 12:22 am Post subject: |
|
|
Hi, if you use Gcc 4.3.3 then you should enable this feature  _________________ Sysadmin of Funtoo-Québec.org
Wiki
Signature
IRC on Freenode : #funtoo-quebec |
|
|
 |
MaximeG l33t

Joined: 15 Apr 2008 Posts: 722 Location: Belgium
|
Posted: Sun Mar 22, 2009 10:19 am Post subject: |
|
|
Hi,
But in the case of an i7, isn't it already activated when using -march=native?
Thanks,
Maxime _________________ Future is wide open. |
|
|
 |
d2_racing Moderator


Joined: 25 Apr 2005 Posts: 12867 Location: Ste-Foy,Canada
|
Posted: Sun Mar 22, 2009 1:38 pm Post subject: |
|
|
I don't think so, because GCC 4.3.3 was released earlier than the Core i7.
So you need to add it manually I think. _________________ Sysadmin of Funtoo-Québec.org
Wiki
Signature
IRC on Freenode : #funtoo-quebec |
|
|
 |
MartyMcFly n00b

Joined: 25 Apr 2007 Posts: 25
|
Posted: Sun Mar 22, 2009 6:17 pm Post subject: |
|
|
Confirmed, this is working flawlessly in gcc-4.3.3-r1 (I've used http://www.gentoo.org/doc/en/gcc-upgrading.xml to upgrade gcc)
CHOST="i686-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native -msse4 -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}"
I've updated the wiki entry accordingly. |
|
|
 |
jasn Apprentice


Joined: 05 May 2005 Posts: 275 Location: Maryland, US
|
Posted: Thu Aug 27, 2009 6:34 am Post subject: |
|
|
I just got my Clevo D900F with an i7-975 (3.33 GHz) CPU and 6 GB of DDR3 RAM, and I'm editing my post since I feel that the information I previously posted was inaccurate. As gringo points out in the next post in this thread, these are the CFLAGS I ended up using for my system:
| Code: | | CFLAGS="-march=native -O2 -pipe" |
In the end, my attempts to tweak them seemed to have no overall effect on my system.
Good Luck..
Last edited by jasn on Mon Aug 31, 2009 2:27 am; edited 2 times in total |
|
|
 |
gringo Advocate


Joined: 27 Apr 2003 Posts: 3505
|
Posted: Thu Aug 27, 2009 8:14 am Post subject: |
|
|
| Quote: | | CFLAGS="-march=native -msse4 -msse4.1 -msse4.2 -mcx16 -msahf -O2 -pipe" |
if you use -march=native together with -O2, -msse4 -msse4.1 -msse4.2 -mcx16 -msahf (among others) are enabled by default for i7, no need to add them.
cheers _________________ Error: Failing not supported by current locale |
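One way to check this on a given machine is to ask GCC which target flags -march=native expands to. This is a minimal sketch, assuming gcc is installed; m_flags is a made-up helper name, and the flags printed depend on the GCC version and the CPU.

```shell
# Print only the -m... target flags from a compiler command line read on stdin.
# (m_flags is a hypothetical helper name, not part of GCC or Portage.)
m_flags() {
    tr ' ' '\n' | grep '^-m'
}

# "gcc -v" prints the cc1 command line on stderr, including the flags that
# -march=native was resolved to on this CPU.
if command -v gcc >/dev/null 2>&1; then
    gcc -march=native -E -v - </dev/null 2>&1 | grep 'cc1' | m_flags || true
fi
```

If -msse4.2, -mcx16 and -msahf already show up in that list, adding them to CFLAGS by hand changes nothing.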
|
|
 |
d2_racing Moderator


Joined: 25 Apr 2005 Posts: 12867 Location: Ste-Foy,Canada
|
Posted: Thu Aug 27, 2009 12:22 pm Post subject: |
|
|
Indeed, but with core2 you may need to add them manually. _________________ Sysadmin of Funtoo-Québec.org
Wiki
Signature
IRC on Freenode : #funtoo-quebec |
|
|
 |
kernelOfTruth Watchman


Joined: 20 Dec 2005 Posts: 5345 Location: Vienna, Austria; Germany; hello world :)
|
Posted: Thu Mar 25, 2010 3:04 pm Post subject: |
|
|
*subscribes*
-mtune=barcelona is mentioned from time to time on the Intel developer forums, including in comparisons of icc's SSE4.2 support with gcc's SSE4.2 capabilities and optimizations for Core i7
Has no one tried it?
This was especially interesting:
| Quote: | If you like -march=native for Core I7, it should work the same on Core I5.
I've used only x86_64 on Core I7, generally setting -mtune=barcelona
-funroll-loops --param max-unroll-times=4 -msse4
If your gcc isn't new enough to support those options, obviously Core I5
wasn't available when it was written up.
|
which IMO might work well since the core i7 architecture has rather large cache _________________ Unofficial minimal livecd x86/amd64 w/reiser4+truecrypt (by Neo2)
2.6.37.2_plus_v1: BFS, CFS,THP,compaction, zcache or TOI
Hardcore Linux user since 2004  |
|
|
 |
Shining Arcanine Veteran

Joined: 24 Sep 2009 Posts: 1110
|
Posted: Thu Mar 25, 2010 3:39 pm Post subject: Re: make.conf parameters for core i7 |
|
|
| lineMain wrote: | Hi,
where can i find detailed information about cflags parameters for i7 (march=nehalem ???, should i specifically indicate that i want 64-bit, etc...)
In Portage arch types, I should choose ia64 right?
could someone clarify these issues?
thanks in advance... |
IA64 is the Itanium ISA. You want to use amd64.
If you are using GCC 4.3.0 or higher, which you likely are, use:
| Code: | CFLAGS="-O3 -march=core2 -mcx16 -msahf -msse4 -pipe"
CXXFLAGS="${CFLAGS}"
FCFLAGS="${CFLAGS}"
FFLAGS="${CFLAGS}" |
I know that people here do not like -O3, but I have a Core 2 Q9500 with Gentoo Linux running on it in a virtual machine using the x86_64 ISA and -O3 appears to be beneficial on it. I have a 32-bit Gentoo Linux virtual machine using -O2 on the same system and things seem to fly on the 64-bit machine while the 32-bit machine seems very laggy in comparison. I know that this is a flawed comparison between -O2 and -O3, but at the very least it shows that -O3 will work. I think -O3 does well in the virtual machine because the processor has two 6MB L2 caches, although it is possible that the improvement is entirely from the 64-bit ISA, but given how similar the 32-bit and 64-bit versions of x86 are, I do not think that is the case.
The Core i7, making more effective use of its cache than my Core 2, should get more of a benefit from -O3 than my virtual machine. Someone mentioned using "-funroll-loops --param max-unroll-times=4" in the CFLAGS. If the original poster wants, he could probably add those to the CFLAGS variable too. Whether or not -O3 and -funroll-loops yield an improvement will depend on the processor's cache, but with a Core i7, the original poster's processor should have ample cache to spare. |
|
|
 |
Gentoo4Work n00b

Joined: 20 Mar 2010 Posts: 39
|
Posted: Sun Mar 28, 2010 7:40 am Post subject: |
|
|
I've currently got a (funtoo) system up, running, and stable. GCC is 4.3.3r-something, gentoo-sources-2.6.33. Pulled from the core2 funtoo repos, rather than the amd64, for whatever difference that makes. The system was built with -march=core2 -O2 -pipe.
Hardware is 2x Xeon X5570's.
Could I simply boot sysrescueCD, tar my installation, copy it onto a few new partitions of the same disk, switch compiler options, emerge -uaDV world, and run benchmarks (assuming they all survive the upgrade)? I've been really curious about how much of a difference compilation options make for non-specific application, but don't want to risk breaking my working installation. |
|
|
 |
d2_racing Moderator


Joined: 25 Apr 2005 Posts: 12867 Location: Ste-Foy,Canada
|
Posted: Sun Mar 28, 2010 7:46 pm Post subject: |
|
|
| Gentoo4Work wrote: | | Could I simply boot sysrescueCD, tar my installation, copy it onto a few new partitions of the same disk, switch compiler options, emerge -uaDV world, |
Almost. Changing CFLAGS actually requires running this:
| Code: |
# emerge -e system
# emerge -e world
|
CFLAGS changes are not caught by emerge -auDNv world _________________ Sysadmin of Funtoo-Québec.org
Wiki
Signature
IRC on Freenode : #funtoo-quebec |
|
|
 |
depontius Veteran

Joined: 05 May 2004 Posts: 1925
|
Posted: Fri Sep 17, 2010 2:19 pm Post subject: |
|
|
New CoreI7 user...
I'm just installing a new Thinkpad W510 with a CoreI7, so this thread popped up when I started thinking about revisiting "/etc/make.conf". To begin, I had:
| Code: | CFLAGS="-O2 -march=native -pipe"
MAKEOPTS="-j5" |
Which has always seemed to be the safe thing to do. Then last night, looking at "top" made me think that not all 4 cores were being used. So today I emerged "sysstat" to see at least some time being accrued on all *8* cores, and indeed /proc/cpuinfo shows that this thing thinks it has 8 cores. I guess that's 4 cores, each with 2-way SMT? Normally I've always heard to set "-j" to one more than the number of CPUs. Should I be using "-j5" or "-j9"? Seems to me that calling 4 cores 8 CPUs on the basis of SMT is cheating a bit.
Also, I found my way to this topic, and it appears that the gcc-4.4.3 that I'm using predates Core i7, in which case what does it do for "-march=native"? I've changed my CFLAGS to "-O2 -march=core2 -mtune=generic -pipe" for now.
Strangely enough, while emerging xulrunner, which ought to be able to chew up everything in gcc, I get:
| Code: | localhost ~ # mpstat -P ALL
Linux 2.6.34-gentoo-r6 (localhost) 09/17/10 _x86_64_ (8 CPU)
10:15:53 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
10:15:53 all 0.04 15.80 2.10 0.39 0.00 0.01 0.00 0.00 81.66
10:15:53 0 0.01 8.69 0.96 0.08 0.00 0.00 0.00 0.00 90.26
10:15:53 1 0.05 18.78 2.77 0.14 0.00 0.00 0.00 0.00 78.26
10:15:53 2 0.01 11.65 1.41 0.05 0.00 0.00 0.00 0.00 86.88
10:15:53 3 0.05 21.09 2.86 0.10 0.00 0.00 0.00 0.00 75.90
10:15:53 4 0.05 10.67 1.56 1.75 0.00 0.00 0.00 0.00 85.98
10:15:53 5 0.05 17.96 2.87 0.81 0.00 0.04 0.00 0.00 78.26
10:15:53 6 0.02 7.22 1.38 0.08 0.00 0.03 0.00 0.00 91.25
10:15:53 7 0.04 30.41 2.98 0.12 0.00 0.00 0.00 0.00 66.44 |
It shows all 8 "CPUs" being used, but at a level that could be met by a single core. I've got
| Code: | CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_X86_64_SMP=y
CONFIG_SMP=y |
in .config, but these utilization numbers just don't look too hot, nor do my elapsed times. Am I missing something? _________________ .sigs waste space and bandwidth |
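The 4-versus-8 question can be checked from /proc/cpuinfo: each "processor" entry is one logical CPU, while "cpu cores" gives the physical cores per package. A small sketch follows; cpu_topology is a hypothetical helper name, and the field names are assumed to be the ones x86 Linux kernels print.

```shell
# Count logical CPUs and report physical cores per package.
# (cpu_topology is a hypothetical helper; pass a file to read a saved copy.)
cpu_topology() {
    awk -F': *' '
        /^processor/ { logical++ }
        /^cpu cores/ { cores = $2 }
        END { printf "%d logical CPUs, %d cores per package\n", logical, cores }
    ' "${1:-/proc/cpuinfo}"
}

# Usage: cpu_topology            (reads /proc/cpuinfo)
#        cpu_topology saved.txt  (reads a saved copy)
```

When the first number is twice the second on a single-socket machine, the extra "CPUs" are SMT siblings, which matches the 4 cores with 2-way SMT guess above.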
|
|
 |
krinn Advocate


Joined: 02 May 2003 Posts: 3204
|
Posted: Fri Sep 17, 2010 3:15 pm Post subject: |
|
|
Because you've mixed up the -j from emerge and the -j from make.conf.
The -j in make.conf (MAKEOPTS) feeds your CPU while running make: it lets make run as many gcc processes as you set.
The -j from emerge feeds your CPU while emerging: it runs as many emerge jobs as you allow
(I'll come back to that in a moment).
OK, so you've set -j9 in MAKEOPTS, which allows 9 gcc runs.
But the way make works prevents your cores from running at 100%.
Generally, this is what happens: make needs to build 3 files, then with those 3 files it can build a 4th one, and so on up to the end of the compile.
So in this case, even though you said "use 9 gcc", make cannot do that and will only run 3 gcc, then 1...
because file 4 needs the 3 other files to be built first.
So you get
at 0s
file1: core1 building, needs 3s to complete
file2: core2 building, needs 5s to complete
file3: core3 building, needs 10s to complete
file4: nothing for now, waiting for the 3 others to be built; will take 5s to build
all other cores: idle...
at 10s
file1: done
file2: done
file3: done
file4: core4 building for 5s
all other cores: idle...
So if you look at your stats, you'll see really poor numbers
core1: working at full load for only 3s, 20% (3s out of 15s)
core2: working at full load for only 5s, 33%
core3: working at full load for only 10s, 66%
core4: doing nothing for 10s and working at full load for 5s, 33%
Huh, those are ugly stats.
As you see, it just means you've found a make run that cannot feed your cores with enough "real" jobs to keep them busy.
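The percentages in this example are just busy time divided by the 15 s the whole build takes (10 s for the first three files in parallel, then 5 s for the fourth). A quick sketch of that arithmetic; utilization is a hypothetical helper name.

```shell
# Percentage of the build time a core was busy (integer division).
# (utilization is a hypothetical helper name.)
utilization() {
    busy=$1
    total=$2
    echo "$(( busy * 100 / total ))%"
}

# Busy seconds for core1..core4 in the example, over a 15 s build.
for busy in 3 5 10 5; do
    utilization "$busy" 15
done
```

The other cores never receive a job, so they sit at 0% no matter how high -j is set.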
Now the emerge -j4 (--jobs) option: this time you allow emerge to run multiple emerge jobs.
This means running 4 emerges at once, so you need to feed emerge with enough packages; the best way would be emerge system or world, but let's take an example:
emerge -j4 package1 package2 package3
Again: you asked for -j4 but gave only 3 packages, so if no dependencies are needed, only 3 emerges will run. See how hard it is to feed the beast?
OK, for this next one I'm not sure whether emerge takes MAKEOPTS as a global or a per-package setting:
if global: emerge -j4 with MAKEOPTS=-j9 = 9 = up to 9 gcc
if per emerge: emerge -j4 with MAKEOPTS=-j9 = 4*9 = up to 36 gcc runs (this is what happens if you do emerge package1 & emerge package2 & emerge package3)
So, if you really need to feed your cores, emerge -j8 will help much more than MAKEOPTS=-j8, and using them both, you'll have a setup to really see the Core i7's performance.
the -j8 for emerge is set in EMERGE_DEFAULT_OPTS="-j8"
the -j8 for make is set in MAKEOPTS="-j8"
You should try emerge prll
prll is a tool that provides the --jobs function for any program. Say you have to convert all *.mp3 to ogg with an mp3_2_ogg tool: prll will feed your cores with as many jobs as you have cores (i.e. for you, running up to 8 mp3_2_ogg processes).
A nice tool for multi-core users.
(PS: using -march=core2 -mtune=generic lets gcc use any core2 instructions but tunes the generated code for generic x86_64 CPUs rather than for core2. So you've just limited your code optimisation. Set -march=core2 -mtune=core2) |
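On the global-versus-per-package question: Portage applies MAKEOPTS to each package's build separately, so the product is the case to plan for when both settings are used. Here is a sketch of that upper bound; worst_case_compilers is a hypothetical helper, and the make.conf values in the comments are only an illustration for a machine with 8 logical CPUs.

```shell
# Upper bound on simultaneous compiler processes when emerge --jobs and
# make -j are combined (hypothetical helper; assumes MAKEOPTS is applied
# to every package build separately).
worst_case_compilers() {
    emerge_jobs=$1
    make_jobs=$2
    echo $(( emerge_jobs * make_jobs ))
}

worst_case_compilers 4 9    # the "per emerge" case: emerge -j4, MAKEOPTS=-j9

# Illustrative make.conf lines for 8 logical CPUs:
#   MAKEOPTS="-j8"
#   EMERGE_DEFAULT_OPTS="--jobs=4 --load-average=8"
```

The --load-average option tells emerge not to start further jobs while the system load is above the given value, which keeps the product from running away on large world rebuilds.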
|
|
 |
depontius Veteran

Joined: 05 May 2004 Posts: 1925
|
Posted: Fri Sep 17, 2010 3:49 pm Post subject: |
|
|
Thanks for the tips, I see your point. When my extra DRAM gets here, I'm thinking of mounting tmpfs over /var/tmp/portage with a decent max size. Then after emerge has unpacked the source, the whole thing runs in RAM. I suspect that will do more for performance than anything with "-j", though I would also expect it to make "-j" work better.
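A sketch of what that tmpfs setup could look like. portage_tmpfs_entry is a hypothetical helper that only prints an fstab line, and the 8G size is an assumption; the cap has to fit the installed RAM and the largest package you build.

```shell
# Print an fstab entry that mounts a size-capped tmpfs over Portage's build dir.
# (portage_tmpfs_entry is a hypothetical helper; it changes nothing on disk.)
portage_tmpfs_entry() {
    size=${1:-4G}
    printf 'tmpfs\t/var/tmp/portage\ttmpfs\tsize=%s,noatime\t0 0\n' "$size"
}

portage_tmpfs_entry 8G

# One-off equivalent, run as root:
#   mount -t tmpfs -o size=8G,noatime tmpfs /var/tmp/portage
```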
I was unaware that "-j" could be used as an option to emerge - I presume when package2 directly requires package1, things still turn out to be serialized?
As for the CFLAGS, why would the "safe CFLAGS" page have said to use "-mtune=generic"? Isn't "-march=core2" equivalent to "-mcpu=core2 -mtune=core2"? I also saw stuff up-thread here about having some other flags like "-sss3e" and such. Any comment on those? Is gcc-4.5 supposed to directly support CoreI7? _________________ .sigs waste space and bandwidth |
|
|
 |
krinn Advocate


Joined: 02 May 2003 Posts: 3204
|
Posted: Fri Sep 17, 2010 4:56 pm Post subject: |
|
|
| depontius wrote: |
As for the CFLAGS, why would the "safe CFLAGS" page have said to use "-mtune=generic"? Isn't "-march=core2" equivalent to "-mcpu=core2 -mtune=core2"? I also saw stuff up-thread here about having some other flags like "-sss3e" and such. Any comment on those? Is gcc-4.5 supposed to directly support CoreI7? |
Well, that's the unofficial Gentoo wiki; it has many errors and the page is outdated. Back then, older gcc versions didn't support core2, so I suppose it seemed good to say -march=nocona -mtune=generic. Well, that was already a mistake: better to use -march=nocona -mtune=nocona too.
The same goes for the mistake about 32 and 64 bits: -march=nocona is good for 32 bits, not only 64 bits. They thought that because the nocona has 64-bit instructions, it was bad to allow gcc to use it.
As if gcc were dumb enough to produce 64-bit code when it sees a 32-bit arch...
Take the core2 example: following the wiki's logic, only 64-bit users could use -march=core2. Lol, you can't imagine how many users out there pick -march=prescott or -march=nocona based on their 32/64-bit Gentoo, when they could just set -march=core2, just like a 32-bit user with a nocona can simply set -march=nocona.
"-march=core2" is equivalent to "-march=core2 -mtune=core2" only if you don't pass gcc an explicit -mtune=something. On x86, -mcpu is just a deprecated synonym for -mtune.
So unless you have a specific purpose, -march=native will do the job: gcc should detect everything your CPU can do and enable it where possible. And many flags automatically imply others. For example, picking -mfpmath=sse implies SSE.
I don't think anyone wants gcc to optimize their code more than the gcc devs do, no?
Users still have plenty of options gcc won't enable unless asked to, because they are for somewhat specific usage (like -mfpmath, or -pipe) |
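To see which -mtune a given -march implies, GCC can be asked to print its resolved target options. This is a sketch under assumptions: target_value is a hypothetical helper, and it relies on `gcc -Q --help=target` printing option names and values in two columns, which may differ between GCC versions.

```shell
# Print the value reported for one "-mfoo=" option in the output of
# `gcc -Q --help=target`, read on stdin (target_value is a hypothetical helper).
target_value() {
    awk -v opt="$1" '$1 == opt { print $2 }'
}

if command -v gcc >/dev/null 2>&1; then
    # No explicit -mtune is given, so this shows what -march=core2 implies.
    gcc -march=core2 -Q --help=target 2>/dev/null | target_value -mtune= || true
fi
```

Running the same pipeline with -march=core2 -mtune=generic shows the explicit -mtune taking over.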
|
|
 |
kklatt n00b

Joined: 25 Oct 2004 Posts: 9
|
Posted: Wed Jan 19, 2011 9:27 pm Post subject: Core I7 question |
|
|
When reading the posts on i7 and Portage, I saw mention of using the amd64 base rather than x86. Did I misunderstand?
-- Update: Went ahead and installed the amd64 stage3, working fine. I did note that the emerge -j option did not affect the system build and updates, as the dependencies of the packages being built limited the builds to 2. Using a small 32G SSHD as the system drive; the machine has 12G of RAM. Neglected to think about how large swap should be at 3x memory; not a practical issue for a while. Building GNOME, ran the emerge with -j6: it first died building the icu library; rebuilt icu by itself and it worked fine, same with libatasmart. /var/tmp is set to 4G, did not see a space problem (cause for review later). Building 4 packages at once (-j4) seems to work best, unsure as to why. All in all, working well -- |
|
|
 |
BlackMajick64 n00b

Joined: 02 Apr 2011 Posts: 2
|
Posted: Sun Jul 31, 2011 8:44 am Post subject: Corei7 CFLAGS |
|
|
Just wanted to post my experience with Core i7 on Gentoo x86_64. The following is what I have found after 3 months of on-and-off testing:
-march=core2, -march=nocona and -march=native are now completely obsolete with GCC 4.6.x on a Core i7.
I have recompiled with emerge -eDv world (20x loop of this) on several occasions using GCC 4.6.0 and 4.6.1 with the following:
CFLAGS="-O3 -march=corei7 -mtune=corei7 -pipe -funsafe-loop-optimizations -funsafe-math-optimizations -msse4.2"
Works like majick. If you don't have GCC 4.6.x yet, give it a whirl. It hasn't broken anything for me so far, and as I've said, I always recompile @world just to be safe; in my case that's 900+ packages. I then prelink everything and it works as expected.
Also consider upping your MAKEOPTS. Not sure why "threads + 1" is still recommended; I use MAKEOPTS=-j100 all the time with 24GB of RAM. If you can afford to tie up your machine's CPU and memory for a while to do a recompile, do yourself a favor and try it out. It does not slow down the compiles at all, but makes them fly. The Core i7 will rip right through it, better than Xeons on most occasions, I notice.
Try setting it to 30-40 if you have less RAM; the Core i7 will fly. Don't be afraid to try upping MAKEOPTS to a ridiculous number, you can ALWAYS go back. A couple of programs fail, like Boost regex for example. This is simply fixed by reducing MAKEOPTS (in my case, with 24GB, going down to -j60 works like a charm).
However, a word of warning about GCC 4.6.x (this is already clear in the Portage messages when you emerge): don't use LTO, it breaks a ton of stuff for me right now. Do NOT use -mavx if you do not have an Intel Sandy Bridge CPU with AVX extensions: after you emerge zlib it will break everything, including your compiler, with no way that I have been able to find to fix it. A format/re-install was necessary for me and I will not make that mistake again. _________________ Asus Rampage III Extreme
24GB Samsung DDR3-1600MHz
I7 990x Extreme Edition Hexcore w/HT o/c 4.2Ghz
5x Intel 80GB SSD/RAID0
4x 7.2K 2TB/RAID5
2x Geforce GTS 250/SLI
50Mb/s bizclass cable
2.6.39-r3
MAKEOPTS="-j100" |
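Before adding an instruction-set flag such as -mavx by hand, it is cheap to check whether the CPU actually advertises it. A minimal sketch, assuming an x86 Linux /proc/cpuinfo with a "flags" line; has_cpu_flag is a hypothetical helper name.

```shell
# Succeed if the CPU advertises the given feature flag.
# (has_cpu_flag is a hypothetical helper; the optional second argument
# points it at a saved copy of /proc/cpuinfo.)
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" | grep -qw -- "$1"
}

# Usage: only consider -mavx when the flag is really there.
#   if has_cpu_flag avx; then echo "AVX present"; else echo "no AVX: leave -mavx out"; fi
```

The -w match is deliberate: a CPU that lists only avx2 or sse4_2 will not be mistaken for one that lists avx or sse4.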
|
|
 |
myceliv Apprentice


Joined: 29 Nov 2007 Posts: 178
|
Posted: Sun Jul 31, 2011 9:28 am Post subject: |
|
|
Quick side note: a good way to deal with packages that want MAKEOPTS="-j1" or whatever... say dev-libs/icu ... is
| Code: | mkdir -p /etc/portage/env/dev-libs
echo 'MAKEOPTS="-j1"' > /etc/portage/env/dev-libs/icu
|
Then you don't have to remember next time, or curse when the build fails and use MAKEOPTS on the command line when you try again. |
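Portage also offers a package.env file that maps package atoms to make.conf-style snippets stored under /etc/portage/env/, which gives the same per-package override. A sketch under assumptions: pin_serial_build and the file name serial-make.conf are made up for illustration, package.env is assumed to be a plain file rather than a directory, and the example runs against a scratch directory instead of the real /etc/portage.

```shell
# Pin one package to a serial build via Portage's package.env mechanism.
# (pin_serial_build and serial-make.conf are hypothetical names; the second
# argument is the config root, defaulting to /etc/portage.)
pin_serial_build() {
    atom=$1
    root_dir=${2:-/etc/portage}
    mkdir -p "$root_dir/env"
    echo 'MAKEOPTS="-j1"' > "$root_dir/env/serial-make.conf"
    echo "$atom serial-make.conf" >> "$root_dir/package.env"
}

# Try it in a scratch directory first and inspect the result.
scratch=$(mktemp -d)
pin_serial_build dev-libs/icu "$scratch"
cat "$scratch/package.env"
```

One snippet file can then be shared by every package that needs -j1, by adding further lines to package.env.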
|
|
 |
kosik n00b

Joined: 09 Nov 2007 Posts: 12 Location: 127.0.0.1
|
Posted: Fri Feb 10, 2012 11:45 am Post subject: |
|
|
| myceliv wrote: | Quick side note: a good way to deal with packages that want MAKEOPTS="-j1" or whatever... say dev-libs/icu ... is
| Code: | mkdir /etc/portage/env/dev-libs
echo 'MAKEOPTS="-j1"' > /etc/portage/env/dev-libs/icu
|
Then you don't have to remember next time, or curse when the build fails and use MAKEOPTS on the command line when you try again. |
Hey, that tip was really, really nice! Didn't know that ...  |
|
|
 |
kosik n00b

Joined: 09 Nov 2007 Posts: 12 Location: 127.0.0.1
|
Posted: Fri Feb 10, 2012 11:47 am Post subject: Re: Corei7 CFLAGS |
|
|
| BlackMajick64 wrote: | Just wanted to post my experience with Corei7 on Gentoo x86_64...... the following is what I have found after 3 months of off and on testing:
However, a word of warning about GCC 4.6.x (this is already clear in the Portage messages when you emerge): don't use LTO, it breaks a ton of stuff for me right now. Do NOT use -mavx if you do not have an Intel Sandy Bridge CPU with AVX extensions: after you emerge zlib it will break everything, including your compiler, with no way that I have been able to find to fix it. A format/re-install was necessary for me and I will not make that mistake again. |
I have zlib compiled with gcc 4.6.2 with LTO. The problem is probably the -O3 thing. I used to use -O3 years ago, but too many packages do not cope well with -O3. Use -O2 and you can probably use LTO on most packages as of today. The only problems I have are glib and libX11. Everything else that _can be_ compiled with it (bash, for example, cannot) runs very well with LTO... |
|
| Back to top |
|
 |