Author |
Message |
Oleksa75 n00b
Joined: 18 Oct 2021 Posts: 15
|
Posted: Tue Mar 12, 2024 6:24 pm Post subject: What processor instructions should be in march and mtune? |
|
|
Hello there!
Please, I need some advice.
I've got a new processor and ran the following command to get its detailed instruction set:
Quote: | >># gcc -march=native -v -E - < /dev/null 2>&1 | grep cc1 | perl -pe 's/ -mno-\S+//g; s/^.* - //g;' |
The output was:
-march=alderlake -mmmx -mpopcnt -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx -mavx2 -mfma -mbmi -mbmi2 -maes -mpclmul -mgfni -mvpclmulqdq -madx -mabm -mclflushopt -mclwb -mcx16 -mf16c -mfsgsbase -mfxsr -msahf -mlzcnt -mmovbe -mmovdir64b -mmovdiri -mpku -mprfchw -mptwrite -mrdpid -mrdrnd -mrdseed -mserialize -msha -mshstk -mvaes -mwaitpkg -mxsave -mxsavec -mxsaveopt -mxsaves -mhreset -mavxvnni --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=12288 -mtune=alderlake -dumpbase -
My question is: should I add ALL of these parameters to my CFLAGS / CXXFLAGS / RUSTFLAGS / GOFLAGS etc.?
I am pretty new to Gentoo, but I want to recompile everything for my new processor.
Thank you for your answer! |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4167 Location: Bavaria
|
Posted: Tue Mar 12, 2024 6:32 pm Post subject: |
|
|
Keep it simple:
Code: | COMMON_FLAGS="-march=native -O2 -pipe" |
and gcc will detect your CPU and optimize for it (-mtune is then no longer necessary at all). _________________ https://wiki.gentoo.org/wiki/User:Pietinger |
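The flag list quoted in the first post already illustrates this: resolving -march=native produced both -march=alderlake and -mtune=alderlake. Below is a minimal sketch that pulls just those two values out of such a flag list. The helper name show_arch_tune is made up for illustration, and the sample string is a shortened copy of the output quoted above.

```shell
# Print only the -march=/-mtune= entries from a flag list read on stdin.
show_arch_tune() {
    tr ' ' '\n' | grep -E '^-m(arch|tune)='
}

# Shortened sample of the cc1 flag list from the first post.
sample='-march=alderlake -mmmx -mpopcnt --param l1-cache-size=32 -mtune=alderlake -dumpbase -'
printf '%s\n' "$sample" | show_arch_tune
```

On a live system the same filter could be fed from the gcc command in the first post instead of a sample string.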
|
|
|
Oleksa75 n00b
Joined: 18 Oct 2021 Posts: 15
|
Posted: Tue Mar 12, 2024 7:25 pm Post subject: |
|
|
pietinger wrote: | Keep it simple:
Code: | COMMON_FLAGS="-march=native -O2 -pipe" |
and gcc will detect your CPU and optimize for it (-mtune is then no longer necessary at all). |
Dear Mr. Pietinger, thank you for the answer! I am very grateful to you! |
|
|
|
logrusx Veteran
Joined: 22 Feb 2018 Posts: 1547
|
Posted: Tue Mar 12, 2024 7:35 pm Post subject: |
|
|
You can try app-misc/resolve-march-native if you want it "expanded", but it only makes sense if you're configuring distcc or if you're curious.
Best Regards,
Georgi |
|
|
|
CaptainBlood Advocate
Joined: 24 Jan 2010 Posts: 3628
|
Posted: Tue Mar 12, 2024 7:50 pm Post subject: |
|
|
pietinger is right to keep it simple.
AFAIK, on amd64, -mtune=native -march=core2 compiles for both native and core2 within the same generated binaries.
If the -mtune (native here) part of the code isn't supported, the system will attempt to run the -march (core2 here) part of the code. For example, this would allow moving a hard disk to a lesser system (down to core2) for a chroot and a fix. (Remember that bash is the first executable called by chroot.)
Beware, however, that there is a cost in binary size.
In an ARM context, -march and -mtune may have slightly different meanings.
Thanks for your attention, interest & support. _________________ USE="-* ..." in /etc/portage/make.conf here.
LT: "I've been doing a passable imitation of the Fontana di Trevi, except my medium is mucus. Sooo much mucus. "
Last edited by CaptainBlood on Wed Mar 13, 2024 9:01 am; edited 1 time in total |
|
|
|
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6098 Location: Dallas area
|
Posted: Tue Mar 12, 2024 8:08 pm Post subject: |
|
|
Depends on what you're trying to do.
If the code will only ever run on your current processor, then -march=native is fine; leave -mtune alone, as it would just duplicate -march.
If you might want to run it on a variety of (modern) processors, then set -march to something like x86-64-v3 and set -mtune to native.
That will allow the code to run on most modern x86 processors while tuning the code for your particular processor. _________________ PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland |
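As a sketch of the portable variant described above, a make.conf fragment might look like the following. The x86-64-v3 level is only an example value (it needs GCC 11 or newer, and every machine that should run the result must support that level); -O2 -pipe is carried over from the earlier suggestion in this thread.

```shell
# /etc/portage/make.conf (fragment, illustrative values only)
# -march: baseline instruction set, any x86-64-v3 capable CPU can run the result.
# -mtune: scheduling and tuning, optimized for the machine doing the build.
COMMON_FLAGS="-march=x86-64-v3 -mtune=native -O2 -pipe"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
```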
|
|
|
logrusx Veteran
Joined: 22 Feb 2018 Posts: 1547
|
Posted: Wed Mar 13, 2024 8:25 am Post subject: |
|
|
CaptainBlood wrote: | pietinger is right keeping it simple. |
Yes, but this way you don't get explanations like yours and Anon-E-moose's and you don't learn much new stuff :)
Best Regards,
Georgi |
|
|
|
CaptainBlood Advocate
Joined: 24 Jan 2010 Posts: 3628
|
Posted: Wed Mar 13, 2024 8:48 am Post subject: |
|
|
I meant at the start.
app-portage/cpuid2cpuflags should help with setting CPU_FLAGS_X86, e.g. here: Code: | cat /etc/portage/package.use/00-skylake-flags-monolithic.conf
*/* CPU_FLAGS_X86: aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sse sse2 sse3 sse4_1 sse4_2 ssse3 #skylake |
The file name is up to you. However, prefixing it with 00 is a way to ensure the definition is taken into account early enough.
Rust and Go are different stories.
Thanks for your attention, interest & support. _________________ USE="-* ..." in /etc/portage/make.conf here.
LT: "I've been doing a passable imitation of the Fontana di Trevi, except my medium is mucus. Sooo much mucus. " |
|
|
|
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2009
|
Posted: Wed Mar 13, 2024 9:17 am Post subject: |
|
|
CaptainBlood wrote: | ...
AFAIK, on amd64, -mtune=native -march=core2 compiles for both native and core2 within the same generated binaries.
If the -mtune (native here) part of the code isn't supported, the system will attempt to run the -march (core2 here) part of the code.
... |
AFAIK, that combination produces code for core2, and only core2, but will adjust the optimizations according to the timings for the "native" hardware; there's no hardware-conditional code generation (I think there are options for that, but they're not -mtune). _________________ Greybeard |
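One way to check this without reading assembly is to ask the compiler which feature macros it predefines for a given flag combination. The sketch below assumes an x86 gcc in PATH. has_avx is a hypothetical helper name, and __AVX__ is used only as a convenient marker for instructions newer than core2.

```shell
# Report whether the given compiler flags enable AVX code generation.
# "gcc -dM -E" dumps the predefined macros; __AVX__ is defined only when
# the selected -march (or an explicit -mavx) allows AVX instructions.
has_avx() {
    if gcc "$@" -dM -E - </dev/null 2>/dev/null | grep -q '__AVX__'; then
        echo yes
    else
        echo no
    fi
}

# Usage (not run here, results depend on the installed compiler):
#   has_avx -march=core2 -mtune=skylake
#   has_avx -march=haswell
```

If -mtune added a second, newer code path, the first call would have to report AVX support. On an x86 gcc it should report "no", while the second call should report "yes".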
|
|
|
CaptainBlood Advocate
Joined: 24 Jan 2010 Posts: 3628
|
Posted: Wed Mar 13, 2024 11:02 am Post subject: |
|
|
Goverp,
You're likely to be right.
I tried to discuss the point years ago on the forum, but never got an answer that was clear to me.
So my statement is indeed more of a personal view than knowledge.
I never took the time to write, compile and bisect dummy but good-enough C code in this respect.
Maybe it's time to focus, as I consider this a serious matter.
Thanks for your attention, interest & support. _________________ USE="-* ..." in /etc/portage/make.conf here.
LT: "I've been doing a passable imitation of the Fontana di Trevi, except my medium is mucus. Sooo much mucus. " |
|
|
|
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6098 Location: Dallas area
|
Posted: Wed Mar 13, 2024 12:30 pm Post subject: |
|
|
-march sets the baseline instruction set to compile for; -mtune sets which CPU the generated code is tuned for (generic is the default when neither option is given).
So -march=core2 code runs on that level of CPU, and most modern processors can also run that binary.
But if you don't set -mtune, it follows -march, so the code is tuned for core2 rather than for the CPU you actually run it on.
Setting -march=core2 and -mtune=skylake (for example) will optimize the core2-level code to run best on Skylake. _________________ PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland |
|
|
|
CaptainBlood Advocate
Joined: 24 Jan 2010 Posts: 3628
|
Posted: Wed Mar 13, 2024 1:06 pm Post subject: |
|
|
As a result, Code: | -march=older -mtune=native | should be considered sub-optimal in proportion to the gap between the two settings.
As a consequence, the benefit of a CPU_FLAGS_X86 setting that matches the native CPU will be downgraded to the "older" level.
Thanks for your attention, interest & support. _________________ USE="-* ..." in /etc/portage/make.conf here.
LT: "I've been doing a passable imitation of the Fontana di Trevi, except my medium is mucus. Sooo much mucus. " |
|
|
|
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6098 Location: Dallas area
|
Posted: Wed Mar 13, 2024 2:43 pm Post subject: |
|
|
Core 2 is kind of old; I would shoot for the Haswell level, at least.
But x86-64-v2 to v4 work for more modern processors.
https://en.wikipedia.org/wiki/X86-64 -- scroll down to see more about x86-64-v*
v2 is Intel Nehalem or AMD Bulldozer level
v3 is Intel Haswell or AMD Excavator level
v4 is Intel Skylake or AMD Zen4 level
Setting one of these as -march gets you a general level of performance; then tune on top of this with -mtune. _________________ PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4167 Location: Bavaria
|
Posted: Wed Mar 13, 2024 2:48 pm Post subject: |
|
|
Anon-E-moose wrote: | v4 is Intel Skylake or AMD Zen4 level |
Are you sure? AFAIK (but I could be wrong here) v4 needs AVX-512, which is not supported by many new Intel CPUs.
My i9-13900K (Raptor Lake) gives me:
Code: | # ld.so --help
...
Subdirectories of glibc-hwcaps directories, in priority order:
x86-64-v4
x86-64-v3 (supported, searched)
x86-64-v2 (supported, searched) |
_________________ https://wiki.gentoo.org/wiki/User:Pietinger |
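The same conclusion can be cross-checked from the CPU feature flags. The sketch below maps a /proc/cpuinfo-style flag list to the highest x86-64 level. The function names are made up, and the per-level flag lists follow the x86-64 psABI level definitions as I understand them (using the Linux spellings pni for SSE3 and abm for LZCNT), so treat them as an assumption to verify.

```shell
# Succeeds if every given flag name occurs in the space-padded list $cpu_flags.
has_all() {
    for f in "$@"; do
        case "$cpu_flags" in
            *" $f "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Print the highest x86-64 level (0 to 4) implied by a flag list passed as
# one space-separated string; 0 means not even the baseline matched.
x86_64_level() {
    cpu_flags=" $1 "
    level=0
    if has_all lm cmov cx8 fpu fxsr mmx syscall sse sse2; then level=1; fi
    if [ "$level" -eq 1 ] && has_all cx16 lahf_lm popcnt pni sse4_1 sse4_2 ssse3; then level=2; fi
    if [ "$level" -eq 2 ] && has_all avx avx2 bmi1 bmi2 f16c fma abm movbe xsave; then level=3; fi
    if [ "$level" -eq 3 ] && has_all avx512f avx512bw avx512cd avx512dq avx512vl; then level=4; fi
    echo "$level"
}

# Usage on a Linux system (not run here):
#   x86_64_level "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
```

For a CPU without AVX-512, such as the i9-13900K discussed here, this would be expected to print 3, which matches the ld.so listing.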
|
|
|
CaptainBlood Advocate
Joined: 24 Jan 2010 Posts: 3628
|
Posted: Wed Mar 13, 2024 3:41 pm Post subject: |
|
|
-march=core2 here, because the alternate system to the Skylake one is a Core 2.
I want to be able to chroot one hard disk on another system for fixes.
The Skylake box is headless, so prone to hard-to-fix issues, e.g. with GRUB.
But this is a local case.
From a strict performance point of view, -march=native or equivalent seems king.
Thanks for your attention, interest & support. _________________ USE="-* ..." in /etc/portage/make.conf here.
LT: "I've been doing a passable imitation of the Fontana di Trevi, except my medium is mucus. Sooo much mucus. " |
|
|
|
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6098 Location: Dallas area
|
Posted: Wed Mar 13, 2024 5:46 pm Post subject: |
|
|
pietinger wrote: | Anon-E-moose wrote: | v4 is Intel Skylake or AMD Zen4 level |
Are you sure? AFAIK (but I could be wrong here) v4 needs AVX-512, which is not supported by many new Intel CPUs.
My i9-13900K (Raptor Lake) gives me:
Code: | # ld.so --help
...
Subdirectories of glibc-hwcaps directories, in priority order:
x86-64-v4
x86-64-v3 (supported, searched)
x86-64-v2 (supported, searched) |
|
I'm going by what the article said.
But having said that, Intel is a mixed bag: there are two Skylakes, one supports AVX-512 and the other doesn't.
Not sure about the other variations of Intel chips.
Interesting, specifically talking about the 13600/13900:
Quote: | And the new Intel CPUs don't support AVX-512 instructions, something Intel had been pushing in its CPUs for years. AVX-512 is present in the P-cores but permanently turned off because the E-cores don't support it. |
_________________ PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland |
|
|
|
logrusx Veteran
Joined: 22 Feb 2018 Posts: 1547
|
Posted: Wed Mar 13, 2024 7:33 pm Post subject: |
|
|
pietinger wrote: | Anon-E-moose wrote: | v4 is Intel Skylake or AMD Zen4 level |
Are you sure? AFAIK (but I could be wrong here) v4 needs AVX-512, which is not supported by many new Intel CPUs.
My i9-13900K (Raptor Lake) gives me:
Code: | # ld.so --help
...
Subdirectories of glibc-hwcaps directories, in priority order:
x86-64-v4
x86-64-v3 (supported, searched)
x86-64-v2 (supported, searched) |
|
AFAIK many new Intel processors, including the latest generation, are not v4 precisely for that reason :)
And for the same reason there's no v4 binhost repo.
Best Regards,
Georgi |
|
|
|
figueroa Advocate
Joined: 14 Aug 2005 Posts: 2963 Location: Edge of marsh USA
|
Posted: Thu Mar 14, 2024 4:23 am Post subject: |
|
|
This isn't definitive but I've wasted a lot of energy looking for the right answer for me. Apparently I didn't quite hit it.
I'm not looking for ultimate optimization. I'm not sure I would notice the difference. I'm running all older hardware from circa 2008-2012. What's in my /etc/portage/make.conf is the following:
Code: | # For i7-2600 (HP Pavilion HPE)
#CFLAGS="-O2 -march=x86-64 -mtune=sandybridge -pipe"
# for i5-3470 (hp elite 8300)
#CFLAGS="-O2 -march=x86-64 -mtune=ivybridge -pipe"
#
# Prior to 5/26/2023
#CFLAGS="-O2 -march=native -pipe"
# New 5/26/2023 for i7-2600 (HP Pavilion HPE)
CFLAGS="-O2 -march=x86-64 -mtune=sandybridge -pipe" |
Note that everything but the last line is commented out. I like to make notes for myself. I'm happy with the results. In part, I'm optimizing for compatibility. I haven't tried the generated code yet on my oldest box, which runs an AMD Phenom 8650 X3.
Apparently, according to the linked wiki article about x86-64, I could use -march=x86-64-v2 instead of just -march=x86-64. _________________ Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi |
|
|
|
christoph_peter_s Tux's lil' helper
Joined: 30 Nov 2015 Posts: 106
|
Posted: Thu Mar 14, 2024 10:43 am Post subject: |
|
|
pietinger wrote: | Keep it simple:
Code: | COMMON_FLAGS="-march=native -O2 -pipe" |
and gcc will detect your CPU and optimize for it (-mtune is then no longer necessary at all). |
Unless one wants to use distcc, this seems the best approach.
One may also want to read https://wiki.gentoo.org/wiki/Safe_CFLAGS, or even https://wiki.gentoo.org/wiki/Category:Processors (depending on which applies to you).
Having said that, I admit to having spent a great deal of time tweaking a box with a rather ancient Pentium G3258. This was an unlocked CPU, which Intel sold as the Anniversary Pentium. I easily reached a 4.5 GHz clock rate, which gave it extreme single-core performance. I wanted distcc for faster updates, and thus I first used -march=haswell. But at least one package (I think it was Rust, but I'm not sure anymore) stubbornly failed to compile, even with distcc disabled. So I used cpuid2cpuflags and enabled/disabled a lot of combinations. I knew the issues stemmed from some disabled SSE instructions in that cheap Pentium CPU, but I was not able to overcome the issue.
Simply using -march=native made all compilations run smoothly.
Long story short: I replaced the CPU with an i5...
So I strongly agree with the recommendation to use -march=native whenever there is no strong reason not to use it. |
|
|
|
|