Gentoo Forums
Can someone explain the new "multiarch" flag in glibc?

curmudgeon
Veteran

Joined: 08 Aug 2003
Posts: 1741

Posted: Sun Oct 21, 2018 10:19 pm

In particular, does it apply only to "similar" architectures (such as amd64 and x86) or just about anything (say amd64 and arm64)?

Thank you in advance.
jagdpanther
l33t

Joined: 22 Nov 2003
Posts: 729

Posted: Mon Oct 22, 2018 2:47 pm

Yes, please explain. The only comment in /usr/portage/profiles is:

Quote:
use.local.desc:sys-libs/glibc:multiarch - enable single DSO with optimizations for multiple architectures


Is this needed if you run 32-bit apps on your x86_64 system, or am I thinking of multilib?
Perfect Gentleman
Veteran

Joined: 18 May 2014
Posts: 1249

Posted: Mon Oct 22, 2018 3:16 pm

AFAIK, it means that glibc is optimized for different CPU archs, e.g. AMD's CPUs and Intel's CPUs.
Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Mon Oct 22, 2018 10:00 pm

Perfect Gentleman wrote:
AFAIK, it means that glibc is optimized for different CPU archs, e.g. AMD's CPUs and Intel's CPUs.

Then why isn't it default OFF, since the whole point of compiling everything is to optimize for a particular CPU?
Not challenging, I'd really like a technical explanation. I can see this for a binary distro, where it would be pretty much mandatory.
khayyam
Watchman

Joined: 07 Jun 2012
Posts: 6227
Location: Room 101

Posted: Mon Oct 22, 2018 10:15 pm

Tony0945 wrote:
Perfect Gentleman wrote:
AFAIK, it means that glibc is optimized for different CPU archs, e.g. AMD's CPUs and Intel's CPUs.

Then why isn't it default OFF, since the whole point of compiling everything is to optimize for a particular CPU? Not challenging, I'd really like a technical explanation. I can see this for a binary distro, where it would be pretty much mandatory.

Tony0945 ... it's off because you already get optimisation for your target arch (so, singularly optimised), and would only need 'multiarch' if you needed a "single DSO optimi[sed] for multiple architectures". It's only, or mostly, a feature for bindists who distribute a "single DSO" that runs on multiple architectures.

best ... khay
Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Mon Oct 22, 2018 11:09 pm

khayyam wrote:
Tony0945 ... it's off because you already get optimisation for your target arch (so, singularly optimised), and would only need 'multiarch' if you needed a "single DSO optimi[sed] for multiple architectures". It's only, or mostly, a feature for bindists who distribute a "single DSO" that runs on multiple architectures.
Code:
IUSE="audit caps compile-locales doc gd hardened headers-only +multiarch multilib nscd profile selinux suid systemtap vanilla"
It's default ON because of the '+'.
Is it OK to turn it off with package.use?

And BTW, what does DSO mean? I don't think it means "Defense Secretary Office".
khayyam
Watchman

Joined: 07 Jun 2012
Posts: 6227
Location: Room 101

Posted: Mon Oct 22, 2018 11:39 pm

khayyam wrote:
Tony0945 ... it's off because you already get optimisation for your target arch (so, singularly optimised), and would only need 'multiarch' if you needed a "single DSO optimi[sed] for multiple architectures". It's only, or mostly, a feature for bindists who distribute a "single DSO" that runs on multiple architectures.

Tony0945 wrote:
Code:
IUSE="audit caps compile-locales doc gd hardened headers-only +multiarch multilib nscd profile selinux suid systemtap vanilla"

It's default ON because of the '+'. Is it OK to turn it off with package.use?

Tony0945 ... I totally misread your post, most probably because I honestly can't see a reason why it would be enabled by default. As for turning it off, I would say yes, but then I should probably offer caution, this is the era of the new and all that comes with it after all :)

Tony0945 wrote:
And BTW, what does DSO mean? I don't think it means "Defense Secretary Office".

dynamic shared object.

best ... khay
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Mon Oct 22, 2018 11:45 pm

Not sure what it does, but it seems to be in the fpu area of the source code. Not sure if it's necessary or not.


ETA:
Quote:
What is this Multiarch?

Multiarch lets you install library packages from multiple architectures on the same machine. This is useful in various ways, but the most common is installing both 64 and 32-bit software on the same machine and having dependencies correctly resolved automatically. In general you can have libraries of more than one architecture installed together and applications from one architecture or another installed as alternatives. Note that it does not enable multiple architecture versions of applications to be installed simultaneously.


https://wiki.debian.org/Multiarch/HOWTO

Maybe they're trying to fix the 64/32 bit jungle that they created.
_________________
PRIME x570-pro, 3700x, 6.1 zen kernel
gcc 13, profile 17.0 (custom bare multilib), openrc, wayland
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21633

Posted: Tue Oct 23, 2018 1:39 am

I think the Debian multiarch is a different project. That one is intended to store architecture-specific libraries at paths that tell you the architecture. Historically, we had /usr/lib64/libNAME.so, and the architecture was simply understood to be "whatever 64-bit code the native system can run." However, there are multiple mutually incompatible 64-bit CPUs in existence (amd64, ppc64, arm64, etc.). Non-native 64-bit was consigned to a longer path, so the path to a library depended on whether it was native or a cross-compiled library. Multiarch proposes that you install the library as /usr/lib/x86_64-pc-linux-gnu/libNAME.so for amd64, so that all the 64-bit architecture's libraries can be co-installed as peers and not cause file collisions.

From looking at the glibc ebuild, it looks like its use of multiarch is intended to compile multiple copies of selected important functions, then runtime resolve to the best implementation for the current CPU. Assuming the build system is otherwise able to respect the local administrator's build flags, this seems unnecessary for Gentoo systems, except in the case that you build a package and install it on several different mostly-compatible CPUs (e.g. an Atom, an Intel Core i3, and an AMD Ryzen -- all in the x86 family, but each with their own quirks and possibly different "best" ways of doing a job).
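
A rough model of that selection, written in Python purely for illustration. This is not how glibc implements it (the real mechanism is C code resolved by the dynamic loader); the variant names, feature names and priority order below are hypothetical.

```python
# Conceptual model only: glibc does this in C, not in Python.
# Variant names, feature names and their ordering are hypothetical.

def memcpy_generic(src: bytes) -> bytes:
    return bytes(src)

def memcpy_sse2(src: bytes) -> bytes:
    return bytes(src)  # stand-in; a real variant would use SSE2 instructions

def memcpy_avx2(src: bytes) -> bytes:
    return bytes(src)  # stand-in; a real variant would use AVX2 instructions

# Ordered from most to least specialised; each entry lists the CPU
# features the variant requires.
VARIANTS = [
    ({"avx2"}, memcpy_avx2),
    ({"sse2"}, memcpy_sse2),
    (set(), memcpy_generic),  # always usable
]

def pick_variant(cpu_features):
    """Return the first variant whose required features the CPU provides."""
    features = set(cpu_features)
    for required, impl in VARIANTS:
        if required <= features:
            return impl
    raise RuntimeError("no usable variant")  # unreachable with a generic fallback

if __name__ == "__main__":
    hosts = [("older CPU", {"sse2"}), ("newer CPU", {"sse2", "avx2"})]
    for name, feats in hosts:
        print(name, "->", pick_variant(feats).__name__)
```

The same table of variants yields a different choice on each host, which is the point of building several copies into one library.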
krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Tue Oct 23, 2018 9:49 am

Hu wrote:
Assuming the build system is otherwise able to respect the local administrator's build flags, this seems unnecessary for Gentoo systems, except in the case that you build a package and install it on several different mostly-compatible CPUs (e.g. an Atom, an Intel Core i3, and an AMD Ryzen -- all in the x86 family, but each with their own quirks and possibly different "best" ways of doing a job).

I disagree with you there, Hu. The same CPU can offer very different implementations of a function, because the CPU handles multiple instruction sets (here I'm speaking of MMX, SSE...) that may each optimize the function in some way.
And because of the CPU's internals (architecture, cache size...), you cannot bet, without checking, on which implementation will work best (i.e. the function built with SSE could perform worse or better than the same function using just MMX).

This can already be seen, to some extent, with mdraid, which runs this kind of test to pick the best implementation for the CPU from the instruction sets it can run.
(not from my dmesg, I don't use mdraid myself)
Code:
raid6: int32x1    869 MB/s
raid6: int32x2    927 MB/s
raid6: int32x4    676 MB/s
raid6: int32x8    643 MB/s
raid6: mmxx1     3071 MB/s
raid6: mmxx2     3413 MB/s
raid6: sse1x1    2033 MB/s
raid6: sse1x2    2573 MB/s
raid6: sse2x1    3710 MB/s
raid6: sse2x2    3909 MB/s
raid6: using algorithm sse2x2 (3909 MB/s)
xor: automatically using best checksumming function: pIII_sse
   pIII_sse  :  8767.200 MB/sec
xor: using function: pIII_sse (8767.200 MB/sec)


This still raises more questions (about how they implement this), but that is a different subject.
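
The selection shown in that log can be modelled roughly as follows. This is a Python sketch of "time each candidate, keep the fastest", not the kernel's actual code; the two XOR candidates and the `pick_fastest` helper are hypothetical.

```python
import time

def xor_bytewise(a: bytes, b: bytes) -> bytes:
    # One byte at a time.
    return bytes(x ^ y for x, y in zip(a, b))

def xor_bigint(a: bytes, b: bytes) -> bytes:
    # Whole buffer at once via arbitrary-precision integers.
    # Assumes both buffers have the same length.
    n = len(a)
    return (int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).to_bytes(n, "big")

def pick_fastest(candidates, a, b, repeats=5):
    """Time every candidate on the same input; return (winner_name, timings)."""
    timings = {}
    for name, func in candidates.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(a, b)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    winner = min(timings, key=timings.get)
    return winner, timings

if __name__ == "__main__":
    a = bytes(range(256)) * 256
    b = bytes(reversed(range(256))) * 256
    candidates = {"bytewise": xor_bytewise, "bigint": xor_bigint}
    winner, timings = pick_fastest(candidates, a, b)
    for name, secs in timings.items():
        rate = len(a) / max(secs, 1e-9) / 1e6  # guard against a zero reading
        print(f"xor: {name:10s} {rate:10.1f} MB/s")
    print("xor: using function:", winner)
```

The measured rates will differ from run to run and from machine to machine, so the winner is only as reliable as the conditions under which the timing was taken.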
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Tue Oct 23, 2018 10:08 am

There sure is a dearth of info on exactly what it is, and the fact that the term is used in seemingly different ways by different (linux) companies is discouraging.

But this seems to be a little clearer (not much though)
Quote:
The option `--enable-multi-arch` in glibc is about
enabling multiple architectures in the sense of IFUNC
support, but not `multiarch` (one word) in the sense
of gcc or the runtime.

Not to be confused with multilib, either in the sense
of multiple runtimes for gcc, or in the rpm sense of
multiple runtimes e.g. i386 and x86-64.

https://sourceware.org/ml/libc-help/2014-12/msg00003.html

Makes me wonder what IFUNC support is though.

Note: In looking at the ebuild they use multiarch as a flag, but what's sent to configure is enable/disable-multi-arch (with the hyphen) :roll:


Edit to add: a quick search yields
Quote:
What is an indirect function (IFUNC)?

The GNU indirect function support (IFUNC) is a feature of the GNU toolchain that allows a developer to create multiple implementations of a given function and to select amongst them at runtime using a resolver function which is also written by the developer. The resolver function is called by the dynamic loader during early startup to resolve which of the implementations will be used by the application. Once an implementation choice is made it is fixed and may not be changed for the lifetime of the process.

https://sourceware.org/glibc/wiki/GNU_IFUNC
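
To make the "chosen once, then fixed" behaviour concrete, here is a small Python model. It is only an analogy: real IFUNC resolution is performed by the dynamic loader on C symbols, and this model resolves on first call rather than at startup. The `IndirectFunction` class, the resolver and the `CPU_HAS_FANCY` flag are all hypothetical names.

```python
class IndirectFunction:
    """Binds to one implementation on first use and never re-resolves."""

    def __init__(self, resolver):
        self._resolver = resolver
        self._impl = None
        self.resolver_calls = 0

    def __call__(self, *args, **kwargs):
        if self._impl is None:
            # The resolver runs exactly once; its answer is kept from then on.
            self.resolver_calls += 1
            self._impl = self._resolver()
        return self._impl(*args, **kwargs)

def strlen_generic(s: bytes) -> int:
    return len(s)

def strlen_fancy(s: bytes) -> int:
    return len(s)  # stand-in for a CPU-specific variant

CPU_HAS_FANCY = True  # hypothetical result of a CPU feature check

def resolve_strlen():
    return strlen_fancy if CPU_HAS_FANCY else strlen_generic

strlen = IndirectFunction(resolve_strlen)

if __name__ == "__main__":
    print(strlen(b"gentoo"), strlen(b"glibc"))
    print("resolver ran", strlen.resolver_calls, "time(s)")
```

Every call after the first goes straight to the bound implementation, so the cost of choosing is paid once per process rather than once per call.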
jagdpanther
l33t

Joined: 22 Nov 2003
Posts: 729

Posted: Tue Oct 23, 2018 4:20 pm

I am still confused. For those of us who only build glibc for use on the system it is compiled (emerged) on, and who are interested in execution speed (and to a lesser degree file size), should we turn off the multiarch use flag for glibc? (i.e. -multiarch)
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Tue Oct 23, 2018 4:46 pm

jagdpanther wrote:
I am still confused. For those of us who only build glibc for use on the system it is compiled (emerged) on, and who are interested in execution speed (and to a lesser degree file size), should we turn off the multiarch use flag for glibc? (i.e. -multiarch)


++
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21633

Posted: Wed Oct 24, 2018 2:06 am

krinn wrote:
Hu wrote:
Assuming the build system is otherwise able to respect the local administrator's build flags, this seems unnecessary for Gentoo systems, except in the case that you build a package and install it on several different mostly-compatible CPUs (e.g. an Atom, an Intel Core i3, and an AMD Ryzen -- all in the x86 family, but each with their own quirks and possibly different "best" ways of doing a job).

I disagree with you there, Hu. The same CPU can offer very different implementations of a function, because the CPU handles multiple instruction sets (here I'm speaking of MMX, SSE...) that may each optimize the function in some way.
And because of the CPU's internals (architecture, cache size...), you cannot bet, without checking, on which implementation will work best (i.e. the function built with SSE could perform worse or better than the same function using just MMX).

I am a bit confused about your disagreement. Are you saying that a given physical CPU will vary the best implementation over time? My point above was that there are many x86 compatible CPUs, and they do not all excel at the same instructions, so different implementations work better for different groups. Your elaboration agrees with that: just because two CPUs both support mmx and sse, it does not follow that both of them will perform better using mmx than sse (or vice versa). It depends on what tradeoffs the CPU manufacturers made. Ideally, the compiler should know the best implementation for each CPU family and emit code accordingly. Indirect functions are a way to handle the case where the user's use case does not allow the compiler to pick a single "ideal" version, because the user wants to use a variety of CPUs that do not all agree on what is ideal.
Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Wed Oct 24, 2018 4:04 am

Hu wrote:
I am a bit confused about your disagreement. Are you saying that a given physical CPU will vary the best implementation over time?

With Intel's crazy microcode patching as of late, it's not impossible...
geki
Advocate

Joined: 13 May 2004
Posts: 2387
Location: Germania

Posted: Wed Oct 24, 2018 5:29 am

krinn wrote:
Code:
raid6: int32x1    869 MB/s
raid6: int32x2    927 MB/s
raid6: int32x4    676 MB/s
raid6: int32x8    643 MB/s
raid6: mmxx1     3071 MB/s
raid6: mmxx2     3413 MB/s
raid6: sse1x1    2033 MB/s
raid6: sse1x2    2573 MB/s
raid6: sse2x1    3710 MB/s
raid6: sse2x2    3909 MB/s
raid6: using algorithm sse2x2 (3909 MB/s)
xor: automatically using best checksumming function: pIII_sse
   pIII_sse  :  8767.200 MB/sec
xor: using function: pIII_sse (8767.200 MB/sec)

This still raises more questions (about how they implement this), but that is a different subject.

since you ask, something easy like this (select and/or switch implementation at runtime):
https://forums.gentoo.org/viewtopic-p-8206816.html#8206816
_________________
hear hear
krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Wed Oct 24, 2018 9:30 am

Hu: my disagreement was with
Quote:
Assuming the build system is otherwise able to respect the local administrator's build flags, this seems unnecessary for Gentoo systems

As (I think) you were assuming it would be useful on Gentoo only in the case where the user runs that code on another CPU.
I was pointing out that it would still be useful: even though Portage and the user (through CFLAGS and CPU_FLAGS_X86) tell the program about the CPU's capabilities, the user cannot know which implementation will perform best.

geki:
I think the two projects share common ground, but not a common goal.
The first should build only the function versions the CPU can handle, in order to pick the best one at runtime.
The second should build every implemented version of a function so that the code runs on every CPU; as such it adds unwanted bloat to a Gentoo system (you don't need the function version optimized for SSE2 on a CPU that cannot handle SSE2). But someone doing that FMV could also be smart and add speed-based detection of which function to use on a CPU capable of running several implementations (such as a CPU able to run both MMX and SSE).

What I was thinking with my "it raises more questions" was more about the implementation of that, as a function may run better or worse depending on the arguments passed to it. And although it looks easy, since everyone would assume SSE runs better than MMX, assuming SSE wins would be bad in practice (it is the same bad assumption as assuming gcc -O3 will always give better results than -O2: in theory it does, in practice it doesn't).

And the second question I had in mind was already answered by Anon-E-moose's IFUNC quote, as I was worried about how the testing is implemented: "Once an implementation choice is made it is fixed and may not be changed for the lifetime of the process".
Because if you ran the test each time the function is called, the test would cancel out any benefit from its result (you test whether MMX or SSE is better, but the testing has a speed cost, voiding the speed gain from branching to the best function).

The result of this (for me) is that it looks good on paper, but I'm less sure it does good in the end. Testing whether a function runs faster or slower with MMX or SSE on a busy CPU may not give the best answer for that CPU. While people try to run benchmarks on an idle CPU so as not to disturb the test, glibc will run its tests when the function is called, with random results in practice, as the CPU may be working hard on something else; the end result is that glibc will say "the MMX version runs faster" while in reality the SSE one will always do better, just not at the time the tests were made.
Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Wed Oct 24, 2018 2:38 pm

Wait wait! So the loader can determine the best implementation of a function, but the compiler can't?
I find that hard to believe. Why does the compiler attempt optimization at all?

As for Intel microcode, all my machines are AMD, and "the crazy Intel microcode" encourages me to NOT buy an Intel in the future.


Last edited by Tony0945 on Wed Oct 24, 2018 9:20 pm; edited 1 time in total
geki
Advocate

Joined: 13 May 2004
Posts: 2387
Location: Germania

Posted: Wed Oct 24, 2018 5:01 pm

krinn wrote:
I think the two projects share common ground, but not a common goal.
The first should build only the function versions the CPU can handle, in order to pick the best one at runtime.
The second should build every implemented version of a function so that the code runs on every CPU; as such it adds unwanted bloat to a Gentoo system (you don't need the function version optimized for SSE2 on a CPU that cannot handle SSE2). But someone doing that FMV could also be smart and add speed-based detection of which function to use on a CPU capable of running several implementations (such as a CPU able to run both MMX and SSE).

yes, vectorclass supports function selection by arch, i.e. for x86 or x86_64. so it is possible to build one binary supporting the complete intel and amd cpu instruction sets for the respective arch. I use a binhost with clients, so I like that this feature enables more cpu features on those clients than on the binhost. FMV seems rather to support library selection for different archs in one binary or dso, dynamically loading an arch-dependent dso. is it possible to build a binary which can be executed on x86, x86_64, arm, arm64, ppc*? if not, FMV seems superfluous to me. if you use FMV for an mmx versus sse switcheroo, that's simply wrong.

well, this answer is meant generically, not specifically for krinn. I am still trying to classify this feature. :o
Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Wed Oct 24, 2018 10:42 pm

Code:
 # equery d glibc
 * These packages depend on glibc:
dev-java/oracle-jre-bin-1.8.0.162-r1 (!prefix ? sys-libs/glibc)
dev-libs/libev-4.24 (elibc_glibc ? >=sys-libs/glibc-2.9_p20081201)
net-fs/autofs-5.1.4 (elibc_glibc ? sys-libs/glibc[rpc(-)])
sys-apps/iproute2-4.18.0 (elibc_glibc ? >=sys-libs/glibc-2.7)
sys-devel/gcc-6.4.0-r4 (elibc_glibc ? >=sys-libs/glibc-2.13)
sys-devel/gcc-7.3.0-r5 (elibc_glibc ? >=sys-libs/glibc-2.13)
sys-devel/gcc-8.2.0-r3 (elibc_glibc ? >=sys-libs/glibc-2.13)
sys-libs/tevent-0.9.37 (elibc_glibc ? <sys-libs/glibc-2.26[rpc(+)])
virtual/libc-1 (elibc_glibc ? sys-libs/glibc:2.2)
www-plugins/adobe-flash-31.0.0.122 (nsplugin ? >=sys-libs/glibc-2.4)
Seems like this would help only java and adobe-flash.
Re-emerging glibc with -multiarch.
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21633

Posted: Thu Oct 25, 2018 1:53 am

Tony0945 wrote:
Wait wait! So the loader can determine the best implementation of a function, but the compiler can't?
I find that hard to believe. Why does the compiler attempt optimization at all?
The loader, by definition, runs on the system where the program is loaded for execution. The compiler may or may not. The loader can inspect the model identifying data of the host CPU, cross-check that against hardcoded rules for what is best on that family, and pick an implementation accordingly. The compiler can know what is best for each family, but it can't know on which family you will ultimately run the code. (Using -march gives it strong hints, but even that is only establishing a partial bound. You are permitted to run on a CPU that is substantially better than the one specified via -march.) The compiler optimizes as best it can with the information available. In the general case, it cannot even be sure that you will run the code only on one family of CPU. You might upgrade the CPU or migrate the hard drive, and not recompile the code. On the other hand, if you migrate to a new CPU, you will necessarily reboot, and when you do, the loader gets a chance to pick the best variant of the code for your current CPU, ignoring what CPU family you had when you compiled the code.
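
The difference in available information can be illustrated with a short Python sketch. It parses the `flags` line of Linux's /proc/cpuinfo format and compares it against a build-time baseline. The baseline set is a made-up stand-in for an -march level, and glibc itself queries the CPU directly rather than reading this file.

```python
def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()

def compare(baseline: set, host: set) -> dict:
    """Relate what the compiler was told to what the host CPU offers."""
    return {
        "runnable": baseline <= host,        # host has everything the build assumed
        "missing": sorted(baseline - host),  # assumed at build time, absent on host
        "unused": sorted(host - baseline),   # extras only run-time selection can use
    }

if __name__ == "__main__":
    build_baseline = {"sse", "sse2"}  # hypothetical stand-in for an -march level
    try:
        with open("/proc/cpuinfo") as fh:
            host = parse_cpu_flags(fh.read())
    except OSError:
        host = set()  # not Linux, or the file is unavailable
    print(compare(build_baseline, host))
```

The "unused" entry is the gap described above: features the host has that the compile-time baseline never promised, and that only a choice made at load time can exploit.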
Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Thu Oct 25, 2018 2:04 am

Thank you, Hu. I understand how multiarch can be useful in many cases, especially when running generic binary code.
In my use case, and I'm sure many others, I compile with -march=native, letting gcc figure out the best code, and when I upgrade to a new CPU, I run "emerge -e @world" still with -march=native.

Edit: I noticed, after recompiling glibc (with -multiarch) and Thunderbird, that Thunderbird loads much faster and without the artifacts it previously displayed, which I had attributed to T-bird itself.
The Main Man
Veteran

Joined: 27 Nov 2014
Posts: 1166
Location: /run/user/1000

Posted: Thu Oct 25, 2018 9:09 am

Running steam for example would require glibc multiarch.
Actually any x86 code running on amd64
I guess at least :roll:
Tony0945
Watchman

Joined: 25 Jul 2006
Posts: 5127
Location: Illinois, USA

Posted: Thu Oct 25, 2018 2:28 pm

kajzer wrote:
Running steam for example would require glibc multiarch.
Actually any x86 code running on amd64
I guess at least :roll:

Wouldn't that be the multilib flag, not multiarch?
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6098
Location: Dallas area

Posted: Thu Oct 25, 2018 3:00 pm

Tony0945 wrote:
Thank you, Hu. I understand how multiarch can be useful in many cases, especially when running generic binary code.
In my use case, and I'm sure many others, I compile with -march=native, letting gcc figure out the best code, and when I upgrade to a new CPU, I run "emerge -e @world" still with -march=native.

Edit: I noticed, after recompiling glibc (with -multiarch) and Thunderbird, that Thunderbird loads much faster and without the artifacts it previously displayed, which I had attributed to T-bird itself.


I didn't notice any artifacts (but I'm running an older tbird anyway) but I did go ahead and recompile glibc without multiarch.
I don't see a need for it, at least on my system.
All times are GMT