P1neapple n00b
Joined: 18 Jul 2014 Posts: 35
xaviermiller Bodhisattva
Joined: 23 Jul 2004 Posts: 8708 Location: ~Brussels - Belgique
|
Posted: Fri Jul 18, 2014 9:42 am Post subject: |
|
|
Simply use "-march=native -mtune=native", which will give the best results for your CPU. _________________ Kind regards,
Xavier Miller |
P1neapple n00b
Joined: 18 Jul 2014 Posts: 35
|
Posted: Fri Jul 18, 2014 10:57 am Post subject: |
|
|
Yeah I heard of that option before.
However I am curious if k8 will work on an Intel i3.
PS I also heard that not including march at all will make it default to native. Is this true? _________________ Gentoo currently running in Virtualbox, hoping to switch to real hardware soon... |
szatox Advocate
Joined: 27 Aug 2013 Posts: 3136
|
Posted: Fri Jul 18, 2014 8:43 pm Post subject: |
|
|
Quote: | I also heard that not including march at all will make it default to native. Is this true? | it's very likely.
Quote: | However I am curious if k8 will work on an Intel i3. | as long as i3 is at least the same arch as k8. I think that would mean it must be x86_64.
Most stuff should just work, though media players are likely to make heavy use of the CPU or even crash, as different CPUs do not always share the same extensions. |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54237 Location: 56N 3W
|
Posted: Fri Jul 18, 2014 9:22 pm Post subject: |
|
|
P1neapple,
-march=k8 tells gcc that it may use instructions that only appear on AMD processors.
That will mostly break multimedia apps.
man gcc: | -march=cpu-type
Generate instructions for the machine type cpu-type. In contrast to -mtune=cpu-type, which merely tunes the generated code for
the specified cpu-type, -march=cpu-type allows GCC to generate code that may not run at all on processors other than the one
indicated. Specifying -march=cpu-type implies -mtune=cpu-type.
The choices for cpu-type are:
native
This selects the CPU to generate code for at compilation time by determining the processor type of the compiling machine.
Using -march=native enables all instruction subsets supported by the local machine (hence the result might not run on
different machines). Using -mtune=native produces code optimized for the local machine under the constraints of the
selected instruction set.
core2
Intel Core 2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support.
k8
opteron
athlon64
athlon-fx
Processors based on the AMD K8 core with x86-64 instruction set support, including the AMD Opteron, Athlon 64, and Athlon
64 FX processors. (This supersets MMX, SSE, SSE2, 3DNow!, enhanced 3DNow! and 64-bit instruction set extensions.)
|
You will not have 3DNow! or enhanced 3DNow! on an Intel i3. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
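Whether a given machine actually has these extensions can be checked before picking a -march value. The following is only a minimal sketch, assuming a Linux x86 system where /proc/cpuinfo contains a "flags" line; the function name has_cpu_flag is made up for this illustration, and 3dnow / 3dnowext are the names the kernel uses for 3DNow! and enhanced 3DNow!.

```shell
# has_cpu_flag FLAG [CPUINFO_FILE]
# Succeeds if FLAG appears as a whole word on the first "flags" line.
# Hypothetical helper; CPUINFO_FILE defaults to /proc/cpuinfo (Linux, x86).
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" |
        tr ' \t' '\n\n' |
        grep -qxF -- "$1"
}

# Usage:
#   has_cpu_flag 3dnow    && echo "3DNow! available"
#   has_cpu_flag 3dnowext && echo "enhanced 3DNow! available"
```

On an Intel i3 both checks would be expected to fail, which is why code built with -march=k8 is not safe there.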
P1neapple n00b
Joined: 18 Jul 2014 Posts: 35
|
Posted: Fri Jul 18, 2014 10:09 pm Post subject: |
|
|
Alright, I re-read the page in the handbook and followed your advice, and it seems like -march=native is the best way to go.
Thanks _________________ Gentoo currently running in Virtualbox, hoping to switch to real hardware soon... |
john_deaux n00b
Joined: 16 Sep 2013 Posts: 56 Location: On the banks of the Pontchartrain
|
Posted: Sat Jul 19, 2014 1:11 am Post subject: |
|
|
I know that this is after the fact since you used "native", but I remember running something that told me what to put in the "-march=" for my Lenovo ThinkPad T440s.
I first consulted the page here: http://wiki.gentoo.org/wiki/Lenovo_ThinkPad_T440s which recommended "-march=core-avx-i -mavx" as a safe CFLAGS entry. But the thing that I ran showed that I should use "-march=corei7-avx -mavx2".
Again, I can't remember what I ran to determine that as I wasn't taking detailed notes then.
J_D |
Budoka l33t
Joined: 03 Jun 2012 Posts: 777 Location: Tokyo, Japan
|
Posted: Sat Jul 19, 2014 2:11 am Post subject: |
|
|
XavierMiller wrote: | Simply use "-march=native -mtune=native", which will give the best for your CPU. |
What does mtune do? |
Cyker Veteran
Joined: 15 Jun 2006 Posts: 1746
|
Posted: Sat Jul 19, 2014 6:00 am Post subject: |
|
|
It's not very well explained, but as far as I could make out, -mtune optimizes the generated code for a particular CPU without using instructions specific to that CPU (e.g. -mtune=pentiumpro would optimise the code for things like superscalar and out-of-order execution on the Pentium Pro, but without actually using Pentium Pro-specific instructions, so it would still run on e.g. a 386).
IIRC you don't need to specify -mtune if you are using -march as e.g. -march <moocow> will also default -mtune to <moocow> as well (I suppose you could set -march and -mtune to different CPUs if you had a new CPU that had the instruction set of, e.g. a Kaveri but had a codepath like a 386 XD) |
khayyam Watchman
Joined: 07 Jun 2012 Posts: 6227 Location: Room 101
|
Posted: Sat Jul 19, 2014 9:13 am Post subject: |
|
|
Cyker wrote: | IIRC you don't need to specify -mtune if you are using -march as e.g. -march <moocow> will also default -mtune to <moocow> as well (I suppose you could set -march and -mtune to different CPUs if you had a new CPU that had the instruction set of, e.g. a Kaveri but had a codepath like a 386 XD) |
Cyker ... actually it will default to '-mtune=generic', but as all cows are sacred one moo is as good as another ;)
Code: | # gcc '-###' -e -v -march=native /usr/include/stdlib.h 2>&1 | grep mtune
/usr/libexec/gcc/i686-pc-linux-gnu/4.7.3/cc1 -quiet /usr/include/stdlib.h "-march=pentium-m" -mno-cx16 -mno-sahf -mno-movbe -mno-aes -mno-pclmul -mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1 -mno-lzcnt -mno-rdrnd -mno-f16c -mno-fsgsbase --param "l1-cache-size=32" --param "l1-cache-line-size=64" --param "l2-cache-size=2048" "-mtune=generic" -quiet -dumpbase stdlib.h -auxbase stdlib -o /home/khayyam/tmp/ccDfVfVF.s "--output-pch=/usr/include/stdlib.h.gch" |
best ... khay |
Ant P. Watchman
Joined: 18 Apr 2009 Posts: 6920
|
Posted: Sat Jul 19, 2014 7:35 pm Post subject: |
|
|
For some common CPUs, "generic" is the optimised setting. I get a different one here:
Code: | /usr/libexec/gcc/i686-pc-linux-gnu/4.8.3/cc1 -quiet /usr/include/stdlib.h "-march=atom" -mno-cx16 -msahf -mmovbe -mno-aes -mno-pclmul -mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mno-xsave -mno-xsaveopt --param "l1-cache-size=24" --param "l1-cache-line-size=64" --param "l2-cache-size=512" "-mtune=atom" -quiet -dumpbase stdlib.h -auxbase stdlib -o /tmp/ccaAuOKJ.s "--output-pch=/usr/include/stdlib.h.gch" |
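To avoid reading these long cc1 lines by eye, the -march and -mtune values can be pulled out with a small filter. This is just a sketch: cc1_option is a hypothetical helper name, and it assumes the driver output contains a line mentioning cc1 in the format shown above.

```shell
# cc1_option NAME
# Reads compiler driver output on stdin and prints the value of the first
# -mNAME=VALUE option found on a line mentioning cc1 (quotes stripped).
# Hypothetical helper, e.g. NAME=arch or NAME=tune.
cc1_option() {
    grep cc1 | tr -d '"' | tr ' ' '\n' | sed -n "s/^-m$1=//p" | head -n 1
}

# Usage (needs gcc):
#   gcc -march=native -E -v - </dev/null 2>&1 | cc1_option tune
```

Fed the two outputs posted above, this would print generic for the pentium-m machine and atom for the Atom machine.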
krinn Watchman
Joined: 02 May 2003 Posts: 7470
|
Posted: Sat Jul 19, 2014 11:27 pm Post subject: |
|
|
You can limit code to i686 but still optimize it for corei7 (in this case it makes little sense; it just provides i686 code that runs better when it happens to run on a corei7).
Code: | gcc '-###' -e -v -march=i686 -mtune=corei7 /usr/include/stdlib.h 2>&1 | grep mtune
COLLECT_GCC_OPTIONS='-e' '-v' '-march=i686' '-mtune=corei7'
/usr/libexec/gcc/i686-pc-linux-gnu/4.7.4/cc1 -quiet /usr/include/stdlib.h -quiet -dumpbase stdlib.h "-march=i686" "-mtune=corei7" -auxbase stdlib -o /tmp/ccYnE3PS.s "--output-pch=/usr/include/stdlib.h.gch" |
Not all optimizations depend on the CPU's instruction set; some concern branching, caching, or a CPU family's strengths and weaknesses. As a silly example, if a CPU is better at /2 than at shifting, you can ask for -march=i686 -mtune=stupidcpu and the compiler should produce i686 code that always uses /2 instead of shifting, because -mtune tells it the CPU is weak at shifting. The result is still i686 code, but it will run faster on that CPU.
So -march selects the instruction set of a CPU family, while -mtune optimizes the code allowed by -march to run (better) on a specific CPU type.
Some CPU families have no specific strengths, weaknesses, or tweaks (or none were implemented for them), so only generic tuning is available; you still get optimized code for them because of the instructions that -march enables.
It is then logical that, if -mtune is not specified, the default is tuning for the CPU family that -march was set to, and generic when no such tuning exists.
You can even specify something that might look weird: -march=corei7 -mtune=pentiumpro, if you know your CPU can run corei7 code but performs poorly with corei7 optimizations and does well with pentiumpro-style optimization.
And in theory a pentiumpro will run fine with -march=pentiumpro -mtune=corei7, as you are asking for valid code generation; the optimization just won't be that good (in theory, because I'm not sure all gcc devs have stuck to that).
To sum up, I don't know why one would want to do that, but it should work to use an Intel CPU with -march=x86-64 -mtune=k8, while -march=k8 is a sure failure, as no Intel CPU handles the 3DNow! instruction set. |
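As a sketch of how that summary might look in /etc/portage/make.conf on an Intel machine (illustrative only; the "-O2 -pipe" part is the usual handbook baseline and an assumption here, not something taken from this thread):

```shell
# /etc/portage/make.conf (fragment, illustrative only)
# Generic x86-64 instruction set with scheduling tuned for K8:
# the result still runs on any x86-64 CPU, Intel included.
CFLAGS="-O2 -pipe -march=x86-64 -mtune=k8"
CXXFLAGS="${CFLAGS}"

# By contrast, -march=k8 lets gcc emit 3DNow! instructions,
# so this variant is not safe on an Intel CPU:
# CFLAGS="-O2 -pipe -march=k8"
```

In practice -march=native remains the simpler choice when the code only has to run on the machine that builds it.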
Cyker Veteran
Joined: 15 Jun 2006 Posts: 1746
|
Posted: Sun Jul 20, 2014 9:49 am Post subject: |
|
|
khayyam wrote: | Cyker ... actually it will default to '-mtune=generic', but as all cows are sacred one moo is as good as another
|
Are you sure? My gcc manpage says
Code: |
-march=cpu-type
Generate instructions for the machine type cpu-type. The choices
for cpu-type are the same as for -mtune. Moreover, specifying
-march=cpu-type implies -mtune=cpu-type.
|
khayyam Watchman
Joined: 07 Jun 2012 Posts: 6227 Location: Room 101
|
Posted: Sun Jul 20, 2014 10:59 am Post subject: |
|
|
Cyker wrote: | khayyam wrote: | Cyker ... actually it will default to '-mtune=generic', but as all cows are sacred one moo is as good as another ;) |
Are you sure? |
Cyker ... well, as Ant P. points out above this is not the same for all architectures, some will set this to the same as -march. I'd always assumed that 'generic' was the default ... I guess as I've never seen it set to anything other than that when -march was set.
Cyker wrote: | My gcc manpage says
Code: | -march=cpu-type
Generate instructions for the machine type cpu-type. The choices
for cpu-type are the same as for -mtune. Moreover, specifying
-march=cpu-type implies -mtune=cpu-type. |
|
Yes, but as you see in the output from gcc above "implies" doesn't necessarily equate to "sets".
best ... khay |
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6098 Location: Dallas area
|
Posted: Sun Jul 20, 2014 11:38 am Post subject: |
|
|
khayyam wrote: | Cyker wrote: | khayyam wrote: | Cyker ... actually it will default to '-mtune=generic', but as all cows are sacred one moo is as good as another |
Are you sure? |
Cyker ... well, as Ant P. points out above this is not the same for all architectures, some will set this to the same as -march. I'd always assumed that 'generic' was the default ... I guess as I've never seen it set to anything other than that when -march was set. |
On older CPUs I think -mtune defaulted to generic if not set; with newer chips, it seems to follow the -march flag.
When I first started paying attention to those flags, I was using 4.7.2, now 4.7.3; I'm not sure whether there have been changes internal to gcc in that regard. _________________ PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland |
nlsa8z6zoz7lyih3ap Guru
Joined: 25 Sep 2007 Posts: 388 Location: Canada
|
Posted: Sun Jul 20, 2014 2:43 pm Post subject: |
|
|
Quote: | PS I also heard that not including march at all will make it default to native. Is this true? |
I don't think so. (Although I am never sure of anything in this regard.)
Here is my reason: http://wiki.gentoo.org/wiki/CFLAGS contains the following advice
Quote: | To see what -march=native or -mtune=native enables for your specific CPU, run gcc -march=native -E -v - </dev/null 2>&1 | grep cc1 |
On my amd fx-8350 here are some sample outputs, using gcc-4.9.0:
(1) Code: | gcc -march=native -E -v - </dev/null 2>&1 | grep cc1
/usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1 -E -quiet -v - -march=bdver2 -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -msse4a -mcx16 -msahf -mno-movbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mlwp -mfma -mfma4 -mxop -mbmi -mno-bmi2 -mtbm -mavx -mno-avx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mno-rdrnd -mf16c -mno-fsgsbase -mno-rdseed -mprfchw -mno-adx -mfxsr -mxsave -mno-xsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver2 -fstack-protector-strong |
(2)
Code: | gcc -mtune=native -E -v - </dev/null 2>&1 | grep cc1
/usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1 -E -quiet -v - --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver2 -march=x86-64 -fstack-protector-strong |
and (no -march and no -mtune)
(3)
Code: | gcc -E -v - </dev/null 2>&1 | grep cc1
/usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1 -E -quiet -v - -mtune=generic -march=x86-64 -fstack-protector-strong |
My own experience is also that omitting -march and using -march=native result in different compilation failures when used in conjunction with other (aggressive) cflags such as lto. |
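The three outputs above can also be compared mechanically rather than by eye. A minimal sketch, assuming the same "grep cc1" output format as in the samples; cc1_flags is a hypothetical helper name, and the usage part needs gcc installed:

```shell
# cc1_flags: read driver output on stdin and print the -m... options of the
# cc1 line, one per line, sorted, with -mno-* (disabled features) dropped.
# Hypothetical helper for comparing two gcc invocations.
cc1_flags() {
    grep cc1 | tr -d '"' | tr ' ' '\n' | grep '^-m' | grep -v '^-mno-' | sort -u
}

# Usage sketch (a and b are temporary files):
#   a=$(mktemp); b=$(mktemp)
#   gcc -march=native -E -v - </dev/null 2>&1 | cc1_flags > "$a"
#   gcc               -E -v - </dev/null 2>&1 | cc1_flags > "$b"
#   comm -23 "$a" "$b"   # options that only -march=native turns on
#   rm -f "$a" "$b"
```

If omitting -march really defaulted to native, the comm output would be empty; with outputs like (1) and (3) above it would not be.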
EmaRsk Apprentice
Joined: 07 Sep 2004 Posts: 158 Location: Italy
|
Posted: Mon Jul 21, 2014 4:35 pm Post subject: |
|
|
nlsa8z6zoz7lyih3ap wrote: | omitting -march and using -march=native |
I guess you meant "omitting -march and using -mtune=native".
For me
Code: | gcc $HOLYCOW -E -v - </dev/null 2>&1 | grep cc1 |
gives these results:
"-march=native -mtune=native" == "-march=native" != "-mtune=native" != ""
So, the omission of -march doesn't imply native, while -march implies -mtune. |
nlsa8z6zoz7lyih3ap Guru
Joined: 25 Sep 2007 Posts: 388 Location: Canada
|
Posted: Tue Jul 22, 2014 4:21 pm Post subject: |
|
|
Over many years I have really enjoyed fussing with CFLAGS and USE Flags concerning various hardware features of the chip that I am actually using.
However, I have never really noticed any significant performance benefit from doing so. Are there any opinions as to whether setting -march in the CFLAGS or just omitting it entirely makes much practical difference at all?
Regardless of the answer, I will continue to experiment with CFLAGS as it is fun and I learn lots by doing so.
Current cpu: FX-8350
PS: Currently playing with -flto and graphite options. |