Gentoo Forums
Instruction sets - Core i7 - Gcc 4.6.1
Well.Heeled.Man
n00b


Joined: 25 Aug 2011
Posts: 19

Posted: Mon Sep 19, 2011 5:03 pm    Post subject: Instruction sets - Core i7 - Gcc 4.6.1

Hi Folks,

I got a new laptop with a Core i7 2630QM (2nd Gen.). I upgraded to GCC 4.6.1 and rebuilt my system against it with -march=native; however, "emerge --info" does not show all the instruction sets (notably sse4.2 and avx seem to be missing). I thought -march=native worked with this version of GCC and the second-generation Cores.

First of all, is it using all of the instruction sets supported by my processor? Why are they not all showing in emerge --info with -march=native (or -march=corei7-avx)? Should I add the supported instruction sets to my USE flags? If so, would anyone care to share what they use with their 2nd Gens?

Cheers,

Scott
Genone
Retired Dev


Joined: 14 Mar 2003
Posts: 9501
Location: beyond the rim

Posted: Tue Sep 20, 2011 6:16 am

Where do you think that information should be shown? If you're talking about USE flags then no, those have nothing to do with gcc flags, and adding non-existent USE flags won't do anything. -march=native is a gcc option, so you'll have to ask gcc what it expands to.
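
One way to ask gcc directly, sketched here for a POSIX shell: dump the compiler's predefined macros for -march=native and keep only the instruction-set ones. The name isa_macros is a made-up helper, and the macro names in the pattern are a small selection rather than a complete list.

```shell
# isa_macros: filter instruction-set macros out of a "gcc -dM -E" dump on stdin.
# (Hypothetical helper name; not part of gcc or Portage.)
isa_macros() {
    grep -E '^#define __(MMX|SSE[0-9_A]*|SSSE3|AVX|AES|PCLMUL|POPCNT)__ ' | LC_ALL=C sort
}

# Only query the compiler if it is actually installed.
if command -v gcc >/dev/null 2>&1; then
    gcc -march=native -dM -E - </dev/null 2>/dev/null | isa_macros \
        || echo "no instruction-set macros reported"
fi
```

If __AVX__ or __SSE4_2__ shows up in that list, the compiler is targeting those extensions, whatever emerge --info lists under USE.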
Well.Heeled.Man
n00b


Joined: 25 Aug 2011
Posts: 19

Posted: Tue Sep 20, 2011 10:37 am

It was in the USE flag list. The KDE desktop profile lists mmx, sse and sse2 in USE flags, but I thought these were processor instruction sets, not USE flags. If these are here, why not the rest?
wrc1944
Advocate


Joined: 15 Aug 2002
Posts: 3432
Location: Gainesville, Florida

Posted: Tue Sep 20, 2011 11:21 am

For reference (excerpt from man:gcc, from the gcc-4.6.1 release):
Code:
Intel 386 and AMD x86-64 Options

These -m options are defined for the i386 and x86-64 family of computers:

-mtune=cpu-type
Tune to cpu-type everything applicable about the generated code, except for the ABI and the set of available instructions. The choices for cpu-type are:

generic
Produce code optimized for the most common IA32/AMD64/EM64T processors. If you know the CPU on which your code will run, then you should use the corresponding -mtune option instead of -mtune=generic. But, if you do not know exactly what CPU users of your application will have, then you should use this option.

As new processors are deployed in the marketplace, the behavior of this option will change. Therefore, if you upgrade to a newer version of GCC, the code generated option will change to reflect the processors that were most common when that version of GCC was released.

There is no -march=generic option because -march indicates the instruction set the compiler can use, and there is no generic instruction set applicable to all processors. In contrast, -mtune indicates the processor (or, in this case, collection of processors) for which the code is optimized.

native
This selects the CPU to tune for at compilation time by determining the processor type of the compiling machine. Using -mtune=native will produce code optimized for the local machine under the constraints of the selected instruction set. Using -march=native will enable all instruction subsets supported by the local machine (hence the result might not run on different machines).

i386
Original Intel's i386 CPU.

i486
Intel's i486 CPU. (No scheduling is implemented for this chip.)

i586, pentium
Intel Pentium CPU with no MMX support.

pentium-mmx
Intel PentiumMMX CPU based on Pentium core with MMX instruction set support.

pentiumpro
Intel PentiumPro CPU.

i686
Same as "generic", but when used as "march" option, PentiumPro instruction set will be used, so the code will run on all i686 family chips.

pentium2
Intel Pentium2 CPU based on PentiumPro core with MMX instruction set support.

pentium3, pentium3m
Intel Pentium3 CPU based on PentiumPro core with MMX and SSE instruction set support.

pentium-m
Low power version of Intel Pentium3 CPU with MMX, SSE and SSE2 instruction set support. Used by Centrino notebooks.

pentium4, pentium4m
Intel Pentium4 CPU with MMX, SSE and SSE2 instruction set support.

prescott
Improved version of Intel Pentium4 CPU with MMX, SSE, SSE2 and SSE3 instruction set support.

nocona
Improved version of Intel Pentium4 CPU with 64-bit extensions, MMX, SSE, SSE2 and SSE3 instruction set support.

core2
Intel Core2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support.

corei7
Intel Core i7 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 instruction set support.

corei7-avx
Intel Core i7 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AES and PCLMUL instruction set support.

atom
Intel Atom CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3 instruction set support.

k6
AMD K6 CPU with MMX instruction set support.

k6-2, k6-3
Improved versions of AMD K6 CPU with MMX and 3DNow! instruction set support.

athlon, athlon-tbird
AMD Athlon CPU with MMX, 3dNOW!, enhanced 3DNow! and SSE prefetch instructions support.

athlon-4, athlon-xp, athlon-mp
Improved AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and full SSE instruction set support.

k8, opteron, athlon64, athlon-fx
AMD K8 core based CPUs with x86-64 instruction set support. (This supersets MMX, SSE, SSE2, 3DNow!, enhanced 3DNow! and 64-bit instruction set extensions.)

k8-sse3, opteron-sse3, athlon64-sse3
Improved versions of k8, opteron and athlon64 with SSE3 instruction set support.

amdfam10, barcelona
AMD Family 10h core based CPUs with x86-64 instruction set support. (This supersets MMX, SSE, SSE2, SSE3, SSE4A, 3DNow!, enhanced 3DNow!, ABM and 64-bit instruction set extensions.)

winchip-c6
IDT Winchip C6 CPU, dealt in same way as i486 with additional MMX instruction set support.

winchip2
IDT Winchip2 CPU, dealt in same way as i486 with additional MMX and 3DNow! instruction set support.

c3
Via C3 CPU with MMX and 3DNow! instruction set support. (No scheduling is implemented for this chip.)

c3-2
Via C3-2 CPU with MMX and SSE instruction set support. (No scheduling is implemented for this chip.)

geode
Embedded AMD CPU with MMX and 3DNow! instruction set support.


While picking a specific cpu-type will schedule things appropriately for that particular chip, the compiler will not generate any code that does not run on the i386 without the -march=cpu-type option being used.

-march=cpu-type
Generate instructions for the machine type cpu-type. The choices for cpu-type are the same as for -mtune. Moreover, specifying -march=cpu-type implies -mtune=cpu-type.

_________________
Main box- AsRock x370 Gaming K4
Ryzen 7 3700x, 3.6GHz, 16GB GSkill Flare DDR4 3200mhz
Samsung SATA 1000GB, Radeon HD R7 350 2GB DDR5
OpenRC Gentoo ~amd64 plasma, glibc-2.36-r7, gcc-13.2.1_p20230304
kernel-6.7.2 USE=experimental python3_11
wrc1944
Advocate


Joined: 15 Aug 2002
Posts: 3432
Location: Gainesville, Florida

Posted: Tue Sep 20, 2011 8:08 pm

As I understand it, emerge --info only reports the default profile and whatever mods you have made in /etc/make.conf.

Also, these readouts are confusing at first glance, as they don't appear to report the same thing: on my current system, gcc reports amdfam10 while /proc/cpuinfo reports cpu family 16 (16 decimal is 10h in hex, so these are in fact the same family).

cat /proc/cpuinfo seems to be more explicit (pni is another name for sse3).
I used to put sse, sse2, sse3 and others in make.conf that, according to man:gcc, corresponded to my current cpu, but since I started using -march=native I haven't. However, now I'm thinking maybe "native" isn't actually passing all the flags the cpu is capable of utilizing.

Come to think of it, when I manually added flags in make.conf the flags would show up in the gcc output while compiling, but with "native" I don't really recall seeing the "full set" as cpuinfo reports. Of course I've always realized Gentoo USE flags and gcc CFLAGS are two different things, but there don't seem to be hard and fast rules defining what actually shows up in the gcc output, with regard to what's in make.conf and its relationship to what the cpu and/or gcc is actually using or capable of. :roll: Sure would like some more reliable info on all this.

Code:
wrc@localhost ~ $ echo "" | gcc -march=native -v -E - 2>&1 | grep cc1
 /usr/libexec/gcc/i686-pc-linux-gnu/4.6.1/cc1 -E -quiet -v - -D_FORTIFY_SOURCE=2 -march=amdfam10 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mabm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mno-avx -mno-sse4.2 -mno-sse4.1 --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=amdfam10
 
wrc@localhost ~ $ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 5
model name      : AMD Athlon(tm) II X4 640 Processor
stepping        : 3
cpu MHz         : 3000.095
cache size      : 512 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
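
Checking that long flags line by eye is error-prone. The following is a minimal sketch, assuming a Linux-style /proc/cpuinfo; cpu_flags and has_flag are hypothetical helper names, and the list of flags tested in the loop is only an example.

```shell
# cpu_flags [FILE]: print the flag words from the first "flags" line
# of a cpuinfo-style file (defaults to /proc/cpuinfo).
cpu_flags() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" | cut -d: -f2
}

# has_flag FLAG [FILE]: succeed only if FLAG appears as a whole word,
# so "sse4" does not match "sse4a".
has_flag() {
    cpu_flags "$2" | tr ' ' '\n' | grep -qxF -- "$1"
}

# Report a few extensions on the local machine, if cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
    for f in pni ssse3 sse4_1 sse4_2 avx; do
        if has_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
    done
fi
```

Note that the kernel spells some names differently from gcc: pni here corresponds to -msse3, and sse4_1 to -msse4.1.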

Veldrin
Veteran


Joined: 27 Jul 2004
Posts: 1945
Location: Zurich, Switzerland

Posted: Tue Sep 20, 2011 8:37 pm

Well.Heeled.Man wrote:
It was in the USE flag list. The KDE desktop profile lists mmx, sse and sse2 in USE flags, but I thought these were processor instruction sets, not USE flags. If these are here, why not the rest?

Because all 64-bit (x86-64) processors support these.
_________________
read the portage output!
If my answer is too concise, ask for an explanation.
Genone
Retired Dev


Joined: 14 Mar 2003
Posts: 9501
Location: beyond the rim

Posted: Tue Sep 20, 2011 9:00 pm

Well.Heeled.Man wrote:
It was in the USE flag list. The KDE desktop profile lists mmx, sse and sse2 in USE flags, but I thought these were processor instruction sets, not USE flags. If these are here, why not the rest?

There are some hardware-related USE flags for packages that can make use of hand-crafted assembler code, for example (e.g. USE=sse could pull in a patch containing assembler code). But those have absolutely nothing to do with C*FLAGS or any other compiler settings. And if there are no use cases for, say, an sse4.1 flag, then such a flag doesn't exist.
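
To make the separation concrete, here is an illustrative /etc/make.conf fragment. The values are assumptions for the sake of the example, not a recommendation for any particular machine.

```shell
# /etc/make.conf (illustrative fragment; values are assumptions)

# Compiler settings: these decide which instructions gcc may emit.
CFLAGS="-O2 -pipe -march=native"
CXXFLAGS="${CFLAGS}"

# Package features: these only matter to ebuilds that define such a flag,
# for example to enable hand-written assembler. They do not alter CFLAGS.
USE="mmx sse sse2"
```

Changing one of these two groups has no effect on the other: adding a name to USE does not pass -m options to gcc, and -march=native does not switch on any USE flag.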
cach0rr0
Bodhisattva


Joined: 13 Nov 2008
Posts: 4123
Location: Houston, Republic of Texas

Posted: Tue Sep 20, 2011 9:11 pm

to see what -march=native expands to:

Code:

gcc -Q -march=native --help=target


example output from my phenom 9950, gcc 4.5.2

Code:

meat@ricker ~ $ gcc -Q -march=native --help=target |egrep '(march|mtune|enabled)'
  -m64                                  [enabled]
  -m80387                               [enabled]
  -m96bit-long-double                   [enabled]
  -mabm                                 [enabled]
  -malign-stringops                     [enabled]
  -march=                               amdfam10
  -mcx16                                [enabled]
  -mfancy-math-387                      [enabled]
  -mfp-ret-in-387                       [enabled]
  -mfused-madd                          [enabled]
  -mglibc                               [enabled]
  -mhard-float                          [enabled]
  -mieee-fp                             [enabled]
  -mno-sse4                             [enabled]
  -mpopcnt                              [enabled]
  -mpush-args                           [enabled]
  -mred-zone                            [enabled]
  -msahf                                [enabled]
  -mstackrealign                        [enabled]
  -mtls-direct-seg-refs                 [enabled]
  -mtune=                               amdfam10



From there, it's time to look up what that specific -march value supports.
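
The egrep filter above can also be written as a small awk summary that prints the -march value first and then the enabled options. This is only a sketch: summarize_target is a hypothetical helper name, and it assumes the two-column layout shown in the output above.

```shell
# summarize_target: read "gcc -Q --help=target" output on stdin and print
# the -march value plus every option marked [enabled].
summarize_target() {
    awk '
        $1 == "-march=" && NF >= 2 { print "march: " $2 }
        $NF == "[enabled]"         { print "on:    " $1 }
    '
}

# Only query the compiler if it is actually installed.
if command -v gcc >/dev/null 2>&1; then
    gcc -Q -march=native --help=target 2>/dev/null | summarize_target
fi
```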
_________________
Lost configuring your system?
dump lspci -n here | see Pappy's guide | Link Stash
wrc1944
Advocate


Joined: 15 Aug 2002
Posts: 3432
Location: Gainesville, Florida

Posted: Wed Sep 21, 2011 1:38 pm

cach0rr0,
Thanks for the gcc -Q -march=native --help=target command; I didn't know about that one. :oops: Tried it, and got:
Code:
 wrc@localhost ~ $ gcc -Q -march=native --help=target
The following options are target specific:
  -m128bit-long-double                  [disabled]
  -m32                                  [enabled]
  -m3dnow                               [disabled]
  -m3dnowa                              [disabled]
  -m64                                  [disabled]
  -m80387                               [enabled]
  -m8bit-idiv                           [disabled]
  -m96bit-long-double                   [enabled]
  -mabi=                     
  -mabm                                 [enabled]
  -maccumulate-outgoing-args            [disabled]
  -maes                                 [disabled]
  -malign-double                        [disabled]
  -malign-functions=         
  -malign-jumps=             
  -malign-loops=             
  -malign-stringops                     [enabled]
  -mandroid                             [disabled]
  -march=                               amdfam10
  -masm=                     
  -mavx                                 [disabled]
  -mbionic                              [disabled]
  -mbmi                                 [disabled]
  -mbranch-cost=             
  -mcld                                 [disabled]
  -mcmodel=                   
  -mcpu=                     
  -mcrc32                               [disabled]
  -mcx16                                [enabled]
  -mdispatch-scheduler                  [disabled]
  -mf16c                                [disabled]
  -mfancy-math-387                      [enabled]
  -mfentry                              [enabled]
  -mfma                                 [disabled]
  -mfma4                                [disabled]
  -mforce-drap                          [disabled]
  -mfp-ret-in-387                       [enabled]
  -mfpmath=                   
  -mfsgsbase                            [disabled]
  -mfused-madd               
  -mglibc                               [enabled]
  -mhard-float                          [enabled]
  -mieee-fp                             [enabled]
  -mincoming-stack-boundary= 
  -minline-all-stringops                [disabled]
  -minline-stringops-dynamically        [disabled]
  -mintel-syntax             
  -mlarge-data-threshold=     
  -mlwp                                 [disabled]
  -mmmx                                 [disabled]
  -mmovbe                               [disabled]
  -mms-bitfields                        [disabled]
  -mno-align-stringops                  [disabled]
  -mno-fancy-math-387                   [disabled]
  -mno-push-args                        [disabled]
  -mno-red-zone                         [disabled]
  -mno-sse4                             [enabled]
  -momit-leaf-frame-pointer             [disabled]
  -mpc                       
  -mpclmul                              [disabled]
  -mpopcnt                              [enabled]
  -mprefer-avx128                       [disabled]
  -mpreferred-stack-boundary=
  -mpush-args                           [enabled]
  -mrdrnd                               [disabled]
  -mrecip                               [disabled]
  -mred-zone                            [enabled]
  -mregparm=                 
  -mrtd                                 [disabled]
  -msahf                                [enabled]
  -msoft-float                          [disabled]
  -msse                                 [disabled]
  -msse2                                [disabled]
  -msse2avx                             [disabled]
  -msse3                                [disabled]
  -msse4                                [disabled]
  -msse4.1                              [disabled]
  -msse4.2                              [disabled]
  -msse4a                               [disabled]
  -msse5                     
  -msseregparm                          [disabled]
  -mssse3                               [disabled]
  -mstack-arg-probe                     [disabled]
  -mstackrealign                        [enabled]
  -mstringop-strategy=       
  -mtbm                                 [disabled]
  -mtls-dialect=             
  -mtls-direct-seg-refs                 [enabled]
  -mtune=                               amdfam10
  -muclibc                              [disabled]
  -mveclibabi=               
  -mvect8-ret-in-mem                    [disabled]
  -mvzeroupper                          [disabled]
  -mxop                                 [disabled]

Pretty confusing, as all the SSE options show as disabled, even though -march= is reported as amdfam10. Could this be related to the fact that I have to select K8 in my kernel config, as there is no other option to build for K10s? (I had read somewhere a while back that the K10 kernel option hadn't been implemented due to some problems, and was apparently never addressed by the kernel devs.)

Should the SSE options my Athlon II X4 640 (Propus core) is capable of be placed in make.conf, as using only -march=native doesn't seem to be doing the job?

Which is appropriate, the -msseX version, or the USE flags sseX, or both? Or maybe switch from -march=native to -march=amdfam10?
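
One way to approach the native-versus-amdfam10 question is to compare what gcc actually hands to cc1 in each case, using the same -v trick as in the earlier post. On this GCC version the -Q --help=target listing may not show extensions that are only implied by the -march value, so the cc1 line can be a more direct view. This is a rough sketch; cc1_mflags and expand_march are hypothetical helper names.

```shell
# cc1_mflags: read "gcc -v" output on stdin, find the cc1 invocation and
# print its -m options one per line, sorted.
cc1_mflags() {
    grep -m1 '/cc1 ' | tr ' ' '\n' | grep '^-m' | LC_ALL=C sort
}

# expand_march ARCH: show the -m options gcc passes to cc1 for -march=ARCH.
expand_march() {
    gcc -march="$1" -E -v - </dev/null 2>&1 | cc1_mflags
}

# Print both expansions so they can be compared side by side.
if command -v gcc >/dev/null 2>&1; then
    echo "== native ==";   expand_march native
    echo "== amdfam10 =="; expand_march amdfam10
fi
```

Any -mno-* entries that appear only under "native" are extensions gcc detected as absent on the local CPU; the rest is what the named -march value already covers.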

Other discussion on similar topic here: https://forums.gentoo.org/viewtopic-t-867539-highlight-amd+fusion.html

Guess I'll also re-read man:gcc (for the umpteenth time) :lol:
All times are GMT