Gentoo Forums
What processor instructions should be in march and mtune?

Oleksa75
n00b
Joined: 18 Oct 2021
Posts: 15

Posted: Tue Mar 12, 2024 6:24 pm

Hello there!

Please, I need some advice.

I've got a new processor and ran the following command to get its detailed instruction set:

Quote:
# gcc -march=native -v -E - < /dev/null 2>&1 | grep cc1 | perl -pe 's/ -mno-\S+//g; s/^.* - //g;'


The output was:

-march=alderlake -mmmx -mpopcnt -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx -mavx2 -mfma -mbmi -mbmi2 -maes -mpclmul -mgfni -mvpclmulqdq -madx -mabm -mclflushopt -mclwb -mcx16 -mf16c -mfsgsbase -mfxsr -msahf -mlzcnt -mmovbe -mmovdir64b -mmovdiri -mpku -mprfchw -mptwrite -mrdpid -mrdrnd -mrdseed -mserialize -msha -mshstk -mvaes -mwaitpkg -mxsave -mxsavec -mxsaveopt -mxsaves -mhreset -mavxvnni --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=12288 -mtune=alderlake -dumpbase -

My question is: should I add ALL these parameters into my CFLAGS / CXXFLAGS / RUSTFLAGS / GOFLAGS etc?

I am pretty new to Gentoo, but I want to recompile everything for my new processor.

Thank you for your answers!

pietinger
Moderator
Joined: 17 Oct 2006
Posts: 4157
Location: Bavaria

Posted: Tue Mar 12, 2024 6:32 pm

Take the simple approach:
Code:
COMMON_FLAGS="-march=native -O2 -pipe"

and gcc will detect your CPU and optimize for it (-mtune is then no longer necessary at all, because -march=native already implies -mtune=native).
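
As a rough illustration of where that line lives, here is a minimal make.conf sketch. It assumes the stock Gentoo layout in which the per-language variables are all derived from COMMON_FLAGS; the FCFLAGS and FFLAGS lines are not from this thread. None of the individual -m options from the first post need to be listed, since gcc works them out from -march=native.

```shell
# /etc/portage/make.conf (excerpt, illustrative)
# One flag set shared by every compiler that portage drives through these variables.
COMMON_FLAGS="-march=native -O2 -pipe"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
FCFLAGS="${COMMON_FLAGS}"
FFLAGS="${COMMON_FLAGS}"

# To see which CPU "native" resolves to on this machine (needs gcc on the PATH):
#   gcc -march=native -Q --help=target | grep -E -- '-m(arch|tune)='
```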
_________________
https://wiki.gentoo.org/wiki/User:Pietinger

Oleksa75
n00b
Joined: 18 Oct 2021
Posts: 15

Posted: Tue Mar 12, 2024 7:25 pm

pietinger wrote:
Take the simple approach:
Code:
COMMON_FLAGS="-march=native -O2 -pipe"

and gcc will detect your CPU and optimize for it (-mtune is then no longer necessary at all, because -march=native already implies -mtune=native).


Dear Mr. Pietinger, thank you for the answer! I am very grateful to you!

logrusx
Veteran
Joined: 22 Feb 2018
Posts: 1538

Posted: Tue Mar 12, 2024 7:35 pm

You can try app-misc/resolve-march-native if you want it "expanded", but it only makes sense if you're configuring distcc or if you're curious.

Best Regards,
Georgi
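
The usual reason to want the expanded form is distcc: a helper machine would resolve -march=native against its own CPU, not yours. As a minimal sketch, assuming you start from the cc1 line that gcc prints (as in the first post), the filtering step can be done with sed alone. The function name strip_negated_flags is made up for this example.

```shell
# Hypothetical helper: read a flag list on stdin and drop every "-mno-*"
# negation, leaving only the options that -march=native switched on.
strip_negated_flags() {
    sed -E 's/ -mno-[^ ]+//g'
}

# Typical use (assumes gcc is installed):
#   gcc -march=native -v -E - </dev/null 2>&1 | grep cc1 | strip_negated_flags
```

The result still begins with the cc1 path and its other options, so trim it down to the -march, -m* and --param entries before reusing it.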

CaptainBlood
Advocate
Joined: 24 Jan 2010
Posts: 3628

Posted: Tue Mar 12, 2024 7:50 pm

pietinger is right to keep it simple.

AFAIK, on the amd64 arch here, -mtune=native -march=core2 compiles for both native and core2 within the same generated binaries.

If the -mtune part of the code (native here) isn't supported, the system will attempt to run the -march part (core2 here). E.g. this would allow moving a HD to a lesser system (down to core2) for a chroot and a fix. (Remember bash is the first executable called by chroot.)

Beware, however, that there is a cost in binary size.

In an ARM context, -march and -mtune may have slightly different meanings.

Thks 4 ur attention, interest & support.
_________________
USE="-* ..." in /etc/portage/make.conf here.
LT: "I've been doing a passable imitation of the Fontana di Trevi, except my medium is mucus. Sooo much mucus. "


Last edited by CaptainBlood on Wed Mar 13, 2024 9:01 am; edited 1 time in total

Anon-E-moose
Watchman
Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Tue Mar 12, 2024 8:08 pm

Depends on what you're trying to do.

If the code will always and only run on your current processor, then -march=native is fine; leave -mtune alone, as -march=native already sets it.

If you want to possibly run it on a variety of (modern) processors, then set -march to something like x86-64-v3 and set -mtune to native.
That will allow the code to run on most modern x86 processors while tuning it for your particular processor.
_________________
PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland

logrusx
Veteran
Joined: 22 Feb 2018
Posts: 1538

Posted: Wed Mar 13, 2024 8:25 am

CaptainBlood wrote:
pietinger is right keeping it simple.


Yes, but this way you don't get explanations like yours and Anon-E-moose's and you don't learn much new stuff :)

Best Regards,
Georgi

CaptainBlood
Advocate
Joined: 24 Jan 2010
Posts: 3628

Posted: Wed Mar 13, 2024 8:48 am

I meant at the start.

app-portage/cpuid2cpuflags should help with setting CPU_FLAGS_X86, e.g. here:
Code:
cat /etc/portage/package.use/00-skylake-flags-monolithic.conf
*/* CPU_FLAGS_X86: aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sse sse2 sse3 sse4_1 sse4_2 ssse3 #skylake

The file name is up to you. However, prefixing it with 00 is a way to ensure the definition is taken into account early enough.

Rust and Go are different stories.
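
For reference, a minimal sketch of how such a file is usually generated rather than typed by hand. It assumes app-portage/cpuid2cpuflags is installed and prints a single line of the form "CPU_FLAGS_X86: ..."; the helper name and the target file name are only examples.

```shell
# Hypothetical helper: prefix the cpuid2cpuflags output with the "*/*" atom
# so that the flags apply to every package.
cpu_flags_line() {
    printf '*/* %s\n' "$1"
}

# Typical use, as root (the file name is an example):
#   cpu_flags_line "$(cpuid2cpuflags)" > /etc/portage/package.use/00cpu-flags
```

CPU_FLAGS_X86 is separate from -march: it switches on optional hand-written code paths in packages that offer them, while -march controls what the compiler itself may emit.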

Thks 4 ur attention, interest & support.

Goverp
Advocate
Joined: 07 Mar 2007
Posts: 2008

Posted: Wed Mar 13, 2024 9:17 am

CaptainBlood wrote:
...
AFAIK, on the amd64 arch here, -mtune=native -march=core2 compiles for both native and core2 within the same generated binaries.

If the -mtune part of the code (native here) isn't supported, the system will attempt to run the -march part (core2 here).
...

AFAIK, that combination produces code for core2, and only core2, but will adjust the optimizations according to the timings for the "native" hardware; there's no hardware-conditional code generation (I think there are options for that, but they're not -mtune).
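
One way to check this on your own machine, sketched under the assumption that gcc is installed on an x86 box: ask the preprocessor which instruction-set macros it predefines for a given set of flags. The helper name isa_macros is made up for this example.

```shell
# Hypothetical helper: list the SSE/AVX feature macros that gcc predefines
# for the flags given as arguments.
isa_macros() {
    gcc "$@" -x c -dM -E /dev/null |
        grep -E '__(SSE|SSE2|SSE3|SSSE3|SSE4_1|SSE4_2|AVX|AVX2|AVX512F)__' |
        sort
}

# Compare, for example:
#   isa_macros -march=core2
#   isa_macros -march=core2 -mtune=native
#   isa_macros -march=native
```

If -mtune only affects tuning, the first two lists should be identical, and only the third should gain entries such as __AVX2__ on a newer CPU.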
_________________
Greybeard

CaptainBlood
Advocate
Joined: 24 Jan 2010
Posts: 3628

Posted: Wed Mar 13, 2024 11:02 am

Goverp,

You're likely to be right.
I tried to discuss the point years ago on the forum, but never got an answer that was clear to me.
So my statement is indeed more of a personal view than knowledge.

I never took the time to write, compile and bisect dummy but good-enough C code in this respect.

Maybe it's time to focus as I consider this a serious matter.

Thks 4 ur attention, interest & support.

Anon-E-moose
Watchman
Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Wed Mar 13, 2024 12:30 pm

-march sets the baseline instruction set to compile for; -mtune sets which CPU the code is optimized for. If you set neither, -mtune defaults to generic.

So -march=core2 code will run on that level of CPU, and most modern processors can also run that binary.
But if you don't set -mtune, GCC tunes for the -march CPU (core2 here) rather than for the machine you are actually on.
Setting -march=core2 and -mtune=skylake (for example) will optimize the core2-level code to run best on Skylake.

CaptainBlood
Advocate
Joined: 24 Jan 2010
Posts: 3628

Posted: Wed Mar 13, 2024 1:06 pm

As a result
Code:
-march=older -mtune=native
should be considered sub-optimal in proportion to the gap between the two settings. :oops:

As a consequence, the influence of CPU_FLAGS_X86 matching native will be downgraded to "older".

Thks 4 ur attention, interest & support.

Anon-E-moose
Watchman
Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Wed Mar 13, 2024 2:43 pm

Core 2 is kind of old; I would shoot for the level of Haswell, at least.

But x86-64-v2 to v4 work for more modern processors.
https://en.wikipedia.org/wiki/X86-64 -- scroll down to see more about x86-64-v*

v2 is Intel Nehalem or Amd Bulldozer level

v3 is Intel Haswell or Amd Excavator level

v4 is Intel Skylake or Amd Zen4 level

Setting one of these as -march gets you a general level of performance; then tune on top of it with -mtune.
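
To find out which of these levels a given CPU reaches, here is a minimal sketch that checks the flags line from /proc/cpuinfo against each level's required features. The flag names are the Linux kernel's spellings (pni is SSE3, abm covers LZCNT); the function names are made up for this example, and anything below v2 is simply reported as the x86-64 baseline.

```shell
# has_all "<flag list>" flag...: succeed only if every named flag is in the list.
has_all() {
    list=" $1 "
    shift
    for f in "$@"; do
        case "$list" in
            *" $f "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Hypothetical helper: print the highest x86-64 level covered by the given flags.
x86_64_level() {
    level=1
    if has_all "$1" cx16 lahf_lm popcnt pni sse4_1 sse4_2 ssse3; then
        level=2
        if has_all "$1" avx avx2 bmi1 bmi2 f16c fma abm movbe xsave; then
            level=3
            if has_all "$1" avx512f avx512bw avx512cd avx512dq avx512vl; then
                level=4
            fi
        fi
    fi
    if [ "$level" -eq 1 ]; then
        echo "x86-64"
    else
        echo "x86-64-v$level"
    fi
}

# Typical use on Linux:
#   x86_64_level "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
```

The printed name can be passed to -march on a GCC recent enough to know the x86-64-v2/v3/v4 names.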

pietinger
Moderator
Joined: 17 Oct 2006
Posts: 4157
Location: Bavaria

Posted: Wed Mar 13, 2024 2:48 pm

Anon-E-moose wrote:
v4 is Intel Skylake or Amd Zen4 level

Are you sure ? AFAIK (but I can be wrong here) v4 needs AVX512 which is not supported by many new Intel CPUs

My i9-13900K (RaptorLake) gives me:
Code:
#  ld.so --help
...
Subdirectories of glibc-hwcaps directories, in priority order:
  x86-64-v4
  x86-64-v3 (supported, searched)
  x86-64-v2 (supported, searched)


CaptainBlood
Advocate
Joined: 24 Jan 2010
Posts: 3628

Posted: Wed Mar 13, 2024 3:41 pm

-march=core2 here, because the alternate system to the skylake one is core2.
I want to be able to chroot one HD on another system for fixes.
The skylake box is headless, so prone to hard-to-fix issues, e.g. grub.

But this is a local case.

From a strict performance pov -march=native or equivalent seems king.

Thks 4 ur attention, interest & support.

Anon-E-moose
Watchman
Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Wed Mar 13, 2024 5:46 pm

pietinger wrote:
Anon-E-moose wrote:
v4 is Intel Skylake or Amd Zen4 level

Are you sure ? AFAIK (but I can be wrong here) v4 needs AVX512 which is not supported by many new Intel CPUs

My i9-13900K (RaptorLake) gives me:
Code:
#  ld.so --help
...
Subdirectories of glibc-hwcaps directories, in priority order:
  x86-64-v4
  x86-64-v3 (supported, searched)
  x86-64-v2 (supported, searched)


I'm going by what the article said.
Having said that, Intel is a mixed bag: there are two Skylakes, one supports AVX-512 and the other doesn't.
Not sure about the other variations of Intel chips.

Interesting, this is specifically about the 13600/13900:
Quote:
And the new Intel CPUs don't support AVX-512 instructions, something Intel had been pushing in its CPUs for years. AVX-512 is present in the P-cores but permanently turned off because the E-cores don't support it.


logrusx
Veteran
Joined: 22 Feb 2018
Posts: 1538

Posted: Wed Mar 13, 2024 7:33 pm

pietinger wrote:
Anon-E-moose wrote:
v4 is Intel Skylake or Amd Zen4 level

Are you sure ? AFAIK (but I can be wrong here) v4 needs AVX512 which is not supported by many new Intel CPUs

My i9-13900K (RaptorLake) gives me:
Code:
#  ld.so --help
...
Subdirectories of glibc-hwcaps directories, in priority order:
  x86-64-v4
  x86-64-v3 (supported, searched)
  x86-64-v2 (supported, searched)


AFAIK many new, including last generation, Intel processors are not v4 precisely for that reason :)
And for the same reason there's no v4 binhost repo.

Best Regards,
Georgi

figueroa
Advocate
Joined: 14 Aug 2005
Posts: 2963
Location: Edge of marsh USA

Posted: Thu Mar 14, 2024 4:23 am

This isn't definitive but I've wasted a lot of energy looking for the right answer for me. Apparently I didn't quite hit it.

I'm not looking for ultimate optimization. I'm not sure I would notice the difference. I'm running all older hardware from circa 2008-2012. What's in my /etc/portage/make.conf is the following:
Code:
# For i7-2600 (HP Pavilion HPE)
#CFLAGS="-O2 -march=x86-64 -mtune=sandybridge -pipe"
# for i5-3470 (hp elite 8300)
#CFLAGS="-O2 -march=x86-64 -mtune=ivybridge -pipe"
#
# Prior to 5/26/2023
#CFLAGS="-O2 -march=native -pipe"
# New 5/26/2023 for i7-2600 (HP Pavilion HPE)
CFLAGS="-O2 -march=x86-64 -mtune=sandybridge -pipe"

Note that everything but the last line is commented out. I like to make notes for myself. I'm happy with the results. In part, I'm optimizing for compatibility. I haven't tried the generated code yet on my oldest box that's running an AMD Phenom 8650 x3.

Apparently, according to the linked wiki article about x86-64 I could use march=x86-64-v2 instead of just march=x86-64.
_________________
Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi

christoph_peter_s
Tux's lil' helper
Joined: 30 Nov 2015
Posts: 106

Posted: Thu Mar 14, 2024 10:43 am

pietinger wrote:
Take the simple approach:
Code:
COMMON_FLAGS="-march=native -O2 -pipe"

and gcc will detect your CPU and optimize for it (-mtune is then no longer necessary at all, because -march=native already implies -mtune=native).


Unless one wants to use distcc, this seems the best approach.
One may also want to read https://wiki.gentoo.org/wiki/Safe_CFLAGS, or even https://wiki.gentoo.org/wiki/Category:Processors (depending on which applies to you).

Having said that, I admit to having spent a great deal of time tweaking a box with a rather ancient Pentium G5258. This was an unlocked CPU, which Intel sold as an anniversary Pentium. I easily reached a 4.5 GHz clock rate, which gave it extreme single-core performance. I wanted distcc for faster updates, and thus I first used -march=haswell. But at least one package (I think it was Rust, but I'm not sure anymore) stubbornly failed to compile, even with distcc disabled. So I used cpuid2cpuflags and enabled/disabled a lot of combinations. I knew the issues stemmed from some disabled SSE instructions in that cheap Pentium CPU, but I was not able to overcome them.
Simply using -march=native made all compilations run smoothly.
Long story short: I replaced the CPU with some i5...
So I strongly agree with the recommendation to use -march=native whenever there is no strong reason not to.
All times are GMT