Gentoo Forums
[Solved] Clarification on -march & CPU_FLAGS
Barbas
n00b


Joined: 11 Nov 2018
Posts: 4

Posted: Sun Nov 11, 2018 2:29 am    Post subject: [Solved] Clarification on -march & CPU_FLAGS

Hello,

is there any CPU optimization going on within the -march option in make.conf (COMMON_FLAGS=) beyond what the CPU_FLAGS_X86 flags offer? I mean, would there be a performance difference between:
  • Setting the CPU-specific -march=native option, complemented in CPU_FLAGS_X86 only by the instruction sets it does not detect, found with the help of cpuid2cpuflags?
  • Setting the generic -march=x86-64 option, complemented in CPU_FLAGS_X86 by all the CPU instruction sets found by cpuid2cpuflags?


In other words, how much is the code optimized for a CPU sub-architecture like skylake, broadwell, ...? Looking at the USE flags available for the relevant ebuilds in the emerge output, the instruction-set-specific USE flags like mmx, mmxext, sse, sse2, sse3, ... are used very rarely, and most ebuilds have no such USE flags at all. So it looks to me like the code produced by emerge/GCC is mostly generic, except for a few packages, but please correct me on this.

I'm not a programmer, nor do I know how compilation or CPUs work, so thanks for any clarification.


Last edited by Barbas on Sun Nov 11, 2018 11:02 am; edited 1 time in total
Muso
Veteran


Joined: 22 Oct 2002
Posts: 1052
Location: The Holy city of Honolulu

Posted: Sun Nov 11, 2018 6:47 am

x86-64 is for the most generic of builds. It is essentially what you get when running any Debian clone (Ubuntu, Mint, elementary, etc.) or RH, SuSE...

native will probe your CPU and set those values for a system more finely tuned to your specific CPU. I'd spend a few extra seconds looking at the Safe CFLAGS page mentioned in the handbook and assign what is listed there to the CFLAGS parameter of make.conf.

Just run lscpu to find the full details so your search on the Safe CFLAGS page is quick.
_________________
"You can lead a horticulture but you can't make her think" ~ Dorothy Parker
2021 is the year of the Linux Desktop!
Ant P.
Watchman


Joined: 18 Apr 2009
Posts: 6920

Posted: Sun Nov 11, 2018 10:06 am

-march=model lets the compiler optimise for a given processor's pipeline depth, latency per instruction, and ILP characteristics.
-march=native also informs the compiler of the cache layout (which can vary between different models of the same chip, e.g. Celerons), so it knows when passes like inlining would be beneficial.
Goverp
Advocate


Joined: 07 Mar 2007
Posts: 2008

Posted: Sun Nov 11, 2018 10:29 am    Post subject: Re: Clarification on -march & CPU_FLAGS

Atronach wrote:
Hello,

is there any more CPU optimization going on within the -march option in make.conf/COMMON_FLAGS= beyond what offer the CPU_FLAGS_X86 flags?
...

I presume by COMMON_FLAGS you mean CFLAGS and CXXFLAGS. They're passed to gcc, which generates code accordingly for whatever valid C/C++ code it's given.
CPU_FLAGS_X86 are passed to the package's build process and affect which C/C++ code is compiled. So if a package looks for SSE2 in CPU_FLAGS_X86, it will probably select a different algorithm and source code to exploit SSE2 instructions, whereas SSE2 in CFLAGS tells the compiler it may use SSE2 instructions wherever it can find an advantage in doing so. The compiler isn't clever enough, or entitled, to rewrite the code to implement a better algorithm, but it will use the instructions to implement a given algorithm in a faster way, though I suspect that's rare. I think there's some macro-level handling going on too, which lets the source code select algorithms depending on which flags are set.
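One way to look at the macro-level handling mentioned above is to ask gcc which predefined macros a given -march turns on; source code can then test those with #ifdef. The sketch below is only a filter over that macro dump, and the helper name list_isa_macros and the set of macro names it matches are assumptions chosen for illustration, not a complete list.

```shell
# list_isa_macros: hypothetical helper (name and pattern are assumptions).
# Reads a "gcc -dM -E" macro dump on stdin and prints the names of a few
# x86 instruction-set macros, one per line, sorted.
list_isa_macros() {
    # Dump lines look like: "#define __SSE2__ 1"
    # The pattern keeps __SSE2__ but skips helper macros such as __SSE2_MATH__.
    awk '$1 == "#define" &&
         $2 ~ /^__(MMX|SSE[0-9]*(_[0-9]+|A)?|SSSE3|AVX[0-9A-Z]*|POPCNT)__$/ { print $2 }' |
        LC_ALL=C sort
}
```

Typical use would be to compare `gcc -march=native -dM -E - </dev/null | list_isa_macros` against the same command with `-march=x86-64`; the second list should be noticeably shorter on a modern CPU.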
_________________
Greybeard
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54238
Location: 56N 3W

Posted: Sun Nov 11, 2018 10:32 am

Atronach,

CFLAGS and CPU_FLAGS_X86 do different things.

CFLAGS give gcc instructions (do this) and permissions (do this if you want).
CPU_FLAGS_X86 used to be plain USE flags. They still are USE flags, but they have become a USE_EXPAND variable; they operate the same way.
CPU_FLAGS_X86 instruct build systems to use hand-optimised code segments that make use of a particular instruction set extension.

CFLAGS are followed by gcc for everything. CPU_FLAGS_X86 apply to a few packages.
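
To make the split concrete, here is a minimal make.conf sketch. The -O2 -pipe flags and the CPU_FLAGS_X86 list are placeholder assumptions, not values taken from this thread; the real list should come from cpuid2cpuflags on your own machine.

```shell
# /etc/portage/make.conf (fragment; all values are illustrative assumptions)

# Followed by gcc for every package that gets compiled.
COMMON_FLAGS="-march=native -O2 -pipe"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"

# Only consulted by the few ebuilds that ship hand-optimised code paths.
# Placeholder list: replace it with the output of cpuid2cpuflags.
CPU_FLAGS_X86="mmx mmxext sse sse2 sse3"
```

Leaving CPU_FLAGS_X86 unset does not switch off anything in the first block; it only means the packages with hand-written variants are not told to use them.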

Now to your question.

Run app-portage/cpuid2cpuflags to determine your CPU_FLAGS_X86
Run
Code:
gcc -### -E - -march=native 2>&1 | sed -r '/cc1/!d;s/(")|(^.* - )|( -mno-[^\ ]+)//g'
to see what -march=native does for you.

I get
Code:
$ gcc -### -E - -march=native 2>&1 | sed -r '/cc1/!d;s/(")|(^.* - )|( -mno-[^\ ]+)//g'
-march=amdfam10 -mmmx -m3dnow -msse -msse2 -msse3 -msse4a -mcx16 -msahf -mpopcnt -mabm -mlzcnt -mprfchw -mfxsr --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=amdfam10
on a Phenom II.
So there is some CPU cache tuning going on there. However, I don't know if -march=amdfam10 implies the --param l1-cache-line-size=64 --param l2-cache-size=512 too.
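
A rough way to poke at that open question is to pull only the --param entries out of the cc1 command line and compare what the driver passes for -march=native with what it passes for an explicit -march. The helper name extract_params is an assumption made up for this sketch.

```shell
# extract_params: hypothetical helper (the name is an assumption).
# Reads a cc1 command line on stdin and prints every "--param name=value"
# pair on its own line. Accepts both "--param x=1" and "--param=x=1".
extract_params() {
    grep -oE -e '--param[ =][a-z0-9-]+=[0-9]+'
}
```

For example, `gcc -### -E - -march=native 2>&1 | tr -d '"' | extract_params` versus the same pipeline with `-march=amdfam10`. Note the limit of this check: it only shows what the driver puts on the command line explicitly. If the second run prints nothing, that still says nothing about the defaults the compiler may apply internally for that model.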

-- Edit --

Goverp got there first :)
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Barbas
n00b


Joined: 11 Nov 2018
Posts: 4

Posted: Sun Nov 11, 2018 11:01 am

Thanks to you all for your elaborate answers. Even though I obviously don't understand all the terminology like pipelining, ILP, inlining or cache layout, I can see now that -march and CPU_FLAGS each do their own portion of the optimization. Even with empty CPU_FLAGS, -march ensures optimization for all the code, while CPU_FLAGS do something extra for a few specific packages. That's enough for me as a regular user. Solved.
All times are GMT