Gentoo Forums
Support for Function Multi-Versioning (FMV)

Shall portage support Function Multi-Versioning?
yes: 15% [ 3 ]
no: 40% [ 8 ]
don't care: 45% [ 9 ]
Total Votes : 20

Author Message
steveL
Watchman


Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Sat May 12, 2018 3:38 pm

steveL wrote:
about enabling extra code paths that can be "dynamically patched"
mv wrote:
How could one do this for an external library?
The library code does it, same as glibc does/used to do, in an init fn. (all modern UNIX apart from MacOS use ELF, iirc.)
mv wrote:
Especially for blas and lapack (which are probably the main example) where more or less every function requires such an adapted code path?
The most "natural" way to solve this (which after the decision about the processor type needs no time at all) is simply to load the "correct" library.
Sure, if that's what's best for a specific library. More often though, it's only about speeding up certain core functions. And even where you have alternate insn subsets available, that doesn't affect most of the codebase (since they won't be used even if you enable the compiler switch.)
mv wrote:
Previously this would have required that the library is dynamically loaded, with all the disadvantages like lacking symbol checking at build time. If I understood correctly, the main feature of function multi-versioning is that this is somehow supported now by the linker.
Which means it's dynamically-loaded by the linker. (Yes, I realise it's not dlopen/dlsym, so it's much cleaner for code and for startup such as PLT init.)

I don't have a problem with that (the linker deciding) at all; and I do take your point that in certain instances, we want the linker to vary what library it loads altogether.

I'd like to see some formalisation of the more specific variant, as well as simply building "fat libs" a la MacOS. (across many more pseudo-archs.. gack.)

Additionally, I really wish some linker "innovation" went into sorting out library-resolution properly, instead of the haphazard "just about works" approach we currently use, which is totally unsuited to production, imo, and only really worthwhile for a development machine/jail. [1]

My concern with the "fat pseudo-lib" approach is that, since it can cover the "specific function" use-case, no-one will bother to develop that, and we will always be using a shotgun when a needle would do it so much better.

The bloat I can see coming is simply unnecessary; and like most design decisions, leaving it till later means much worse downstream effects until it's corrected (which it might never be, from how I see everything else being developed); and the results are the whole point of our work.

It certainly cannot be called "multi-function versioning" even if that is the use-case (as discussed, it can be used to effect that, but it is not what is actually happening here.)
It is "multi-lib" or "multi-libobj" or whatever term the binutils bods want to use for a library object, and exactly the same approach as "fat libs" on MacOS.

So in all honesty, it should just be called "fat libs" if you want to discuss it with people who already deal with this situation, following the prior art. They will instantly know what you mean, and have the correct definition in mind.
If you say "multi-function versioning", chances are people will think you mean what both krinn and I thought was under discussion, when it is not.

Instead the "fat lib" approach is being touted as a mechanism to effect multi-function versioning, which obviously it covers since you're changing the whole library; but it is not the same, and it sure as hell is not elegant, unless you truly have a library where the vast majority of the codebase differs per subarch, as you say blas and lapack do.

And I still concur that needs to be addressed, and am happy for those to use fat libs, iff it makes sense for the specific lib.
The question remains: what about all the others?

Just please, let's not pretend this is true multi-function versioning, or as soon as you leave the Gentoo/Linux bubble, build-system people will wonder where you learnt such bizarrely-twisted definitions; which only reflects badly. (Don't even start me off on weakrefs, and weak symbols and "weak aliases" in GNU land.. just compare and contrast with Levine on "Linkers and Loaders".)

Even where it makes sense to use fat libs, work on the specific will be useful, in assessing the impact, and deciding which approach to take.

--
[1] I see this as very similar to how "backgrounding" daemons are totally unsuited to production, and only conceivably worthwhile when coding or debugging, and thus starting in terminal: when it still isn't a good idea, afaic. (as you're debugging or coding it, so you don't want it hanging around when you're done; use the shell to background it, with /dev/null redirections as you see fit, and leave all that dud code out. "Look Ma, no pidfiles!" ;)
mv
Watchman


Joined: 20 Apr 2005
Posts: 6747

Posted: Sat May 12, 2018 9:08 pm

steveL wrote:
I do take your point [...] It certainly cannot be called "multi-function versioning"

Neither was it "my" point nor "my" terminology: I just took the terminology from the thread title, and all my explanation was essentially just a summary of how I understood the link in the first posting which was obviously posted to explain what is meant by the title. Perhaps Krinn and you have not read that link?
Quote:
Sure, if that's what's best for a specific library.

As I said, it is probably not accidental that blas and lapack are the main example. I do not know many other non-bundled time-critical cases, but probably there are (e.g. in dev-ros?).
OTOH, that most multimedia programs have their own bundled copy of ffmpeg or libav is much worse IMHO. (E.g. avidemux, mpv, and handbrake all bundle - or will in the very near future - all their libs, and the Gentoo developers gave up on unbundling since the maintenance cost in the long run would be too high.) Of course, I do not pretend that the things discussed here are the main reason why upstream insists on bundling, but perhaps they are one of the smaller reasons.
Quote:
The library code does it, same as glibc does/used to do, in an init fn

This would mean that in every function which can be optimized (I suppose for blas and lapack this is almost every function) the library needs a conditional jump, possibly destroying a pipeline. If this is a hot function, the time needed for this can really add up. Admittedly, there can be worse things, but if that cost can be avoided altogether...
And, BTW, I doubt that such a thing would ever happen to blas or lapack: These are such ancient and established libraries that nobody dares to seriously touch them. (It is probably not accidental that the only full implementations are still in Fortran.)
Quote:
"fat libs"

I think the point of fat libs is that they are built in a single file. With "multiple libs" (which is perhaps a better name) the situation is different: They are just ordinary libs, not requiring any new format. Moreover, you can add support for a new "architecture" by just compiling an additional lib and storing it at the proper place; and you can remove support for an unneeded one and save the harddisk space by just deleting the lib again.
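The "multiple libs" idea can be sketched in a few lines of C. This is only an illustration under stated assumptions: the subdirectory names, the feature bits (`HW_SSE4`, `HW_AVX2`) and the `pick_variant` helper are all hypothetical and do not describe the layout any real loader uses. It just shows that adding or deleting one variant changes which ordinary library gets picked, with a generic build as the fallback.

```c
#include <stddef.h>

/* Hypothetical CPU feature bits, for illustration only. */
enum { HW_SSE4 = 1u << 0, HW_AVX2 = 1u << 1 };

/* One entry per build of the same library. */
struct variant {
    const char *subdir;  /* hypothetical directory name; "" is the generic build */
    unsigned    needs;   /* feature bits this build requires */
};

/* Ordered from most specific to most generic. */
const struct variant variants[] = {
    { "avx2", HW_SSE4 | HW_AVX2 },
    { "sse4", HW_SSE4 },
    { "",     0 },
};

#define N_VARIANTS (sizeof variants / sizeof variants[0])

/*
 * Return the index of the first variant that is both installed and usable
 * on this CPU, or -1 if there is none. Bit i of `installed` says whether
 * variants[i] is present on disk (a stand-in for a real file check).
 */
int pick_variant(unsigned cpu, unsigned installed)
{
    for (size_t i = 0; i < N_VARIANTS; i++) {
        int present = (installed >> i) & 1u;
        int usable  = (variants[i].needs & cpu) == variants[i].needs;
        if (present && usable)
            return (int)i;
    }
    return -1;
}
```

With all three builds installed, a CPU reporting both bits gets index 0 (the "avx2" build); delete that one file and the same CPU falls back to index 1 without anything else changing.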
krinn
Watchman


Joined: 02 May 2003
Posts: 7470

Posted: Sat May 12, 2018 11:58 pm

mv wrote:
Perhaps Krinn and you have not read that link?

I agree I might not have understood it; maybe I should re-read it, and I will if I feel the need to answer again on the subject, but really, can't you just avoid this?
It took you what, 3-4 answers before getting the guns out...
mv
Watchman


Joined: 20 Apr 2005
Posts: 6747

Posted: Sun May 13, 2018 5:06 am

krinn wrote:
It took you what, 3-4 answers before getting the guns out...

Quite the opposite. I refuse to accept fame or blame for ideas which are not mine. Everything I said in this thread (except the possible gentoo-specific consequences, of course) was already written in those links (or maybe sublinks, I forgot), although perhaps not in a very understandable way. I do not consider it an offense to remind people of this when they entered the discussion later and have therefore perhaps not read the whole thread with all its links and sublinks (IMHO there is nothing bad about that; I do the same sometimes), especially when I realize that suddenly all of these ideas begin to be falsely attributed to me.
steveL
Watchman


Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Sun May 13, 2018 1:54 pm

steveL wrote:
I do take your point [...] It certainly cannot be called "multi-function versioning"
mv wrote:

Neither was it "my" point nor "my" terminology: I just took the terminology from the thread title
I was responding to your point about the most "natural" way to solve this, which still reads to me like a point you were making. The terminology I never attributed to you, that I can see.

No, I never read the link; I don't go to external sites if I can avoid it, as I hate all the crappy javascript et al, which can crash a browser, or lead to an endless wait, even (or perhaps especially) if you have js disabled.
mv wrote:
This would mean that in every function which can be optimized (I suppose for blas and lapack this is almost every function) the library needs a conditional jump, possibly destroying a pipeline. If this is a hot function, the time needed for this can really add up. Admittedly, there can be worse things, but if that cost can be avoided altogether...
Sure it can; that's why I referred to how the kernel uses NOPs which are replaced with traps; no conditionality about it. I don't recall how glibc used to do what it did on x86, it was years ago, but it was similarly unconditional post-init. (and not much use when building from source for native use.)

But, sure, it's much better if the linker can handle it, with nothing needed in libcode, beyond the multiple fn defns (in multi-functional, vs multiple-lib.)
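For anyone wondering what "unconditional post-init" can look like without the linker's help, here is a minimal sketch in portable C. It is not how glibc or the kernel does it; it assumes a plain function pointer that starts out aimed at a resolver, and the names (`sum_ints`, `sum_resolve`, `cpu_has_fast_path`, `resolve_count`) are made up for the example. After the first call the pointer is rebound, so later calls are a single indirect call with no per-call test of the CPU type.

```c
#include <stddef.h>

/* Two hypothetical implementations of the same routine. */
static long sum_generic(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

static long sum_unrolled(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)
        s += v[i];
    return s;
}

/* Stand-in for a real CPU probe; an assumption of this sketch. */
int cpu_has_fast_path = 0;

/* Counts how often the resolver ran, only to make the behaviour visible. */
int resolve_count = 0;

static long sum_resolve(const int *v, size_t n);

/* Callers always go through this pointer; it starts at the resolver. */
long (*sum_ints)(const int *, size_t) = sum_resolve;

/* First call lands here: choose once, rebind the pointer, forward the call. */
static long sum_resolve(const int *v, size_t n)
{
    resolve_count++;
    sum_ints = cpu_has_fast_path ? sum_unrolled : sum_generic;
    return sum_ints(v, n);
}
```

The decision is paid for exactly once; what remains is the cost of an indirect call, which is roughly what a call through the PLT costs anyway. Note this sketch is not thread-safe during the first call, which is one more reason to prefer the loader doing the binding.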
mv wrote:
I think the point of fat libs is that they are built in a single file. With "multiple libs" (which is perhaps a better name) the situation is different: They are just ordinary libs, not requiring any new format. Moreover, you can add support for a new "architecture" by just compiling an additional lib and storing it at the proper place; and you can remove support for an unneeded one and save the harddisk space by just deleting the lib again.
Yes, that's true, I suppose we have to have some benefit for the multiplicity of sub-archs.
It is still exactly the same overall approach as fat libs, just distributed over the filesystem (which fat libs can be too, on installation.)
And in binary distribution, which is the only context where this could make sense, it's all part of the same distball; so again, exactly the same as fat libs from an operator perspective, too.

I agree "multiple libs" is a better term, certainly if you're discussing it with build-system people who work across userlands.

Glad to hear that old FORTRAN code is still kicking about; makes me feel much more hopeful.
mv
Watchman


Joined: 20 Apr 2005
Posts: 6747

Posted: Sun May 13, 2018 2:48 pm

steveL wrote:
that's why I referred to how the kernel uses NOPs which are replaced with traps; no conditionality about it.

Self-modifying code is rather a problem nowadays: Code should be only in readonly segments for security reasons (I do not know how it is with mmap()ed data anyway). In addition, perhaps there are also processor caching issues.
Naturally, the kernel has more freedom in these things when it starts up. But if you remember how glibc does it, I would be interested to learn.
Quote:
And in binary distribution, which is the only context where this could make sense, it's all part of the same distball

Not necessarily. Whether things are packaged in one or separate "packages" and whether just the required ones or all are installed (and even which ones are provided) is completely up to the distribution (and the capabilities of its package manager). Perhaps only the most generic one in /usr/lib is forcefully installed as the fallback, but even that need not be the case. I remember times when SuSE had separate i386, i586 and i686 rpm packages for many packages.
steveL
Watchman


Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Mon May 14, 2018 9:13 pm

steveL wrote:
that's why I referred to how the kernel uses NOPs which are replaced with traps; no conditionality about it.
mv wrote:
Self-modifying code is rather a problem nowadays: Code should be only in readonly segments for security reasons (I do not know how it is with mmap()ed data anyway). In addition, perhaps there are also processor caching issues.
Naturally, the kernel has more freedom in these things when it starts up. But if you remember how glibc does it, I would be interested to learn.
I don't like SMC either (didn't much like the kernel NOP thing, but it makes sense.)

I can't be sure, but I think it was something to do with PLT/GOT rewrite, in _startup, before main. That's certainly the sanest approach, given that we have all those function pointers all over the gaffe. And yes, marking read-only was/is a concern (which is why it's better if the linking-loader can do it.)

WRT packaging, however you split it (and I don't think package manglers are that sophisticated, across linuxen [1]), you are still doing the same thing as fat libs: providing multiple copies of the library, with runtime/install selection of which one to load.

Selection of functions to utilise at runtime, via the obvious feature bitmap and maximal match, still needs to be an option, for the vast majority of libraries which do not vary that much, to fulfil the binhost use-case. (And then it's much easier to compare and contrast, as one would already need to do with the output on say blas and lapack.)
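To make "feature bitmap and maximal match" concrete, here is a rough sketch in C. Everything in it is an assumption for illustration: the feature bits (`F_SSE2`, `F_AVX`, `F_AVX2`), the `versions` table and the `select_best` helper are hypothetical, and "maximal" is taken to mean "requires the most feature bits among the versions the CPU can run". A real implementation might well rank candidates differently.

```c
#include <stddef.h>

/* Hypothetical feature bits, for illustration only. */
enum { F_SSE2 = 1u << 0, F_AVX = 1u << 1, F_AVX2 = 1u << 2 };

typedef int (*impl_fn)(int);

/* One candidate version of a function, with the features it depends on. */
struct candidate {
    unsigned required;
    impl_fn  fn;
};

/* Stand-ins: in real code these would be differently compiled bodies. */
static int scale_generic(int x) { return x * 2; }
static int scale_sse2(int x)    { return x * 2; }
static int scale_avx2(int x)    { return x * 2; }

const struct candidate versions[] = {
    { 0,                       scale_generic },
    { F_SSE2,                  scale_sse2 },
    { F_SSE2 | F_AVX | F_AVX2, scale_avx2 },
};

/* Portable population count. */
static unsigned bits_set(unsigned x)
{
    unsigned n = 0;
    for (; x; x &= x - 1)
        n++;
    return n;
}

/* Return the runnable candidate that requires the most features, or NULL. */
const struct candidate *select_best(const struct candidate *c, size_t n,
                                    unsigned have)
{
    const struct candidate *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if ((c[i].required & have) != c[i].required)
            continue;  /* needs something this CPU lacks */
        if (!best || bits_set(c[i].required) > bits_set(best->required))
            best = &c[i];
    }
    return best;
}
```

The selection runs once per function at load or init time, so a library that only has a handful of specialised routines carries a handful of extra bodies rather than a whole second copy of itself.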

As stated the concern is about the usual "we'll only use it for this" becoming "oh look it works for that, so therefore it must be the best approach," (no, it's not!) "we've implemented it now, people who moan should have said something before; you can always patch it, it's under the GPL so therefore everything is hunky-dory". NAK (and no more response, as it's a waste of time and headspace.)
Cue decades of wasted hours and bloat, on what everyone knew was the wrong approach from the get-go. The joys of FLOSS.

--
[1] Nor is making them more complex to scratch an itch, a good use of time and effort. eg: multilib on gentoo.
All times are GMT
Page 2 of 2