Gentoo Forums :: Portage & Programming
Avoiding Recompilation
mv
Watchman
Joined: 20 Apr 2005
Posts: 6747

PostPosted: Sun Aug 03, 2014 2:35 pm    Post subject: Avoiding Recompilation

The intention of this thread is to summarize some ideas from a recent discussion on the gentoo-dev mailing list and on a related bug, so that the discussion can continue here without spamming that list with details that not everybody is interested in.
In particular, the topic I want to discuss here is partially independent of those other discussions.

Description of the problem

Originally, the problem arose in the discussion about the behaviour of portage and other package managers with respect to static (vs. dynamic) dependencies, see e.g. this WikiPage. Roughly speaking, this discussion was about: should portage take the dependency information from /var/db or from the current tree? (Currently the behaviour is mixed, which leads to various problems.)

It turned out that it might be a good idea to have a method to "bump" a package or at least to update its data in /var/db and .tbz2 files in a rather controlled way, if possible without recompiling the whole package.
Of course, the recompilation can only be skipped in certain cases if the ebuild maintainer knows exactly why he wants to skip it.
Perhaps one can find a solution which is useful in more cases than just the current "static vs. dynamic deps" problem.

Although the whole discussion originally appeared in the context of the "static vs. dynamic deps" debate, in the author's opinion the suggestions under 2. below are of independent interest and are useful independently of the outcome of that debate: they could avoid redundant recompilations in either case.

Suggested solutions
  1. Ignore the problem and live with redundant recompilations. This has some variants:
    a. If tree policy decides to support static deps fully, the number of unnecessary recompilations will probably increase sharply.
    b. If tree policy decides on a half-hearted support of static deps, there might soon be problems of packages not actually being updated although they should be.
    c. If tree policy decides to stay with dynamic deps, portage has to be fixed, since currently it falls back to static deps in many cases. This fix of portage is not trivial, and nobody is volunteering to write it.
  2. Use some mechanism to tell portage that certain bumps can be done without recompilation. This also comes in some variants (explained in more detail below):
    a. Use minor revisions
    b. Use a new metadata variable
    c. Use other variables
    d. Invent some other mechanism (special file, entry in metadata.xml, etc.)
  3. Use other mechanisms to update the dependencies. For instance:
    a. Update by some pre-defined rules
    b. Extend the current pkgmove mechanism dramatically
    c. Invent some other mechanism.

Some details/remarks:

3.a means that trivial changes like the dependency rewrite foo/bar -> foo/bar:0 are simply "copied" into the installed-package data; for other changes (e.g. adding foo/bar:=) one might need more complicated rules, e.g. different behaviour depending on whether the addition happens within || ( ... ) or within an "and" dependency; the latter should perhaps require recompilation, etc.
The disadvantage of this method is that it is hard to find rules which are correct in all cases, and so again unnecessary recompilations have to be forced in some cases.
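To make 3.a slightly more concrete, here is a minimal hand-written sketch of applying such a trivial rule to one installed package; the package name is made up, and editing /var/db directly like this is of course not an existing, supported mechanism:
Code:
#!/bin/bash
# Sketch only: apply the trivial rule "foo/bar -> foo/bar:0" to the RDEPEND
# recorded in the vdb for one (hypothetical) installed package, leaving the
# installed files untouched.
vdb=/var/db/pkg/app-misc/consumer-1.0
out=()
for tok in $(<"${vdb}/RDEPEND"); do
    [[ ${tok} == foo/bar ]] && tok=foo/bar:0   # the rule itself
    out+=("${tok}")
done
printf '%s\n' "${out[*]}" > "${vdb}/RDEPEND"
# Atoms with version operators, USE conditionals or || ( ... ) groups already
# need smarter rules than this plain token match - exactly the drawback above.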

3.b has the disadvantage of being limited to a certain "update language" which will probably never be able to cover all cases. Moreover, the database will probably grow permanently, and entries cannot easily be removed. Difficulties arise if some change should be "undone" later on (which for pkgmoves is explicitly forbidden).
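For comparison, the existing update files are already a tiny such "update language"; something along the following lines is what 3.b would amount to (the move/slotmove lines are the real profiles/updates syntax of today, the "depmove" line is purely invented, and the file name is only an example):
Code:
# the move/slotmove lines below are today's real profiles/updates syntax;
# "depmove" is an invented extension of the kind 3.b would need
cat >> profiles/updates/3Q-2014 <<'EOF'
move media-libs/old-name media-libs/new-name
slotmove =dev-libs/libfoo-2* 0 2
depmove foo/bar foo/bar:0
EOF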

2.a means that a new version syntax is introduced which allows "subrevisions", e.g. foo/bar-4711-r1.2. If updating from foo/bar-4711-r1.1 or foo/bar-4711-r1, the package manager is allowed to skip the phases "unpack", "prepare", "configure", "compile", "install", "merge" (incl. "remove") but will act almost as if the package were reinstalled: instead of actually merging files, it will just take over the previous /var/db/pkg/foo/bar-4711-r*/CONTENTS file.
Of course, it is completely up to the ebuild maintainer to judge whether this behaviour is really correct in all cases.
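Roughly, such a sub-revision "merge" would amount to no more than the following hand-written sketch (reusing the example version from above; a real implementation would of course also have to refresh the remaining metadata files consistently and remove the old entry):
Code:
# Sketch of a metadata-only "merge" from foo/bar-4711-r1.1 to -r1.2:
old=/var/db/pkg/foo/bar-4711-r1.1
new=/var/db/pkg/foo/bar-4711-r1.2
mkdir -p "${new}"
cp "${old}/CONTENTS" "${new}/CONTENTS"   # the installed files are unchanged
# dependency strings and the ebuild itself would be taken from the tree here,
# and only afterwards would the old vdb entry be dropped.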

2.b means that there is a new variable (whose name might be discussed here) with e.g. a syntax similar to DEPEND: if upgrading from a version mentioned in this variable, the phases mentioned in 2.a can be skipped by the package manager. In a more complex setting, even USE flags might be used in that variable, meaning that the corresponding part of the variable only becomes active (or fails to become active) if the upgrade is from an installation with the corresponding USE flag.
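A hypothetical ebuild fragment, only to illustrate the idea; the variable name and the dependency shown are invented here and, as said, open for discussion:
Code:
# fragment of a hypothetical foo/bar-4711-r2.ebuild; NO_RECOMPILE_FROM is not
# a real variable, and per the remark below it would need a new EAPI to define it
RDEPEND="dev-libs/libbaz:="   # the changed dependency that triggered the bump
# upgrades from the versions listed here may skip unpack/prepare/configure/
# compile/install and merely refresh the recorded metadata:
NO_RECOMPILE_FROM="=foo/bar-4711-r1"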

2.c means that some arbitrary variable name is used which is treated similarly to 2.b. In contrast to 2.a and 2.b this would not require an EAPI bump of the package, but it appears somewhat hackish.

All variants of 2. have the advantage/disadvantage that they might be used/misused to propagate other changes as well (like USE flags) without recompilation. The latter is reasonable only in the extended variant of 2.b and would require much care from the ebuild maintainer.
All variants of 2. have the disadvantage that mistakes by the ebuild maintainer can be rather severe and cause very subtle problems.

With 2.a there is the problem that subrevisions are already used by "sub"-distributions of gentoo (e.g. by prefix-portage). Also, it would require tools to be updated. (eix can do it, but other tools are probably not yet prepared for such changes).

With 2.b and 2.c there is the problem that the maintainer of the ebuild could easily forget that the variable needs to be updated, and that e.g. repoman cannot know whether an unchanged content of this variable is intended or a mistake. An idea to avoid this problem is to always require a certain change in this variable if it should be kept (e.g. by requiring that the current package version must be written as a first [otherwise ignored] word into this variable, or that the name of the variable should contain the revision). These latter suggestions, however, are probably rather confusing and inelegant.

Edit: Added some explanations/links according to the subsequent suggestion.


Last edited by mv on Mon Aug 04, 2014 2:32 pm; edited 2 times in total

Genone
Retired Dev
Joined: 14 Mar 2003
Posts: 9532
Location: beyond the rim

PostPosted: Mon Aug 04, 2014 12:28 pm

If you want a discussion it might help if you define what the "static vs. dynamic dependencies" issue is or at least link to the relevant thread. Because I have no clue what that is about.

Regarding 2), the usual problem with "special" upgrades is always that it is not a property of a single CPV but the property of the relevant upgrade path. So whatever mechanism is used has to specify at least a "last compatible" version. Using minor revisions with special semantics is IMO a bad idea, as sooner or later people will use it for other purposes and then request more special casing (people always get creative when it comes to versioning).

mv
Watchman
Joined: 20 Apr 2005
Posts: 6747

PostPosted: Mon Aug 04, 2014 2:34 pm

Genone wrote:
If you want a discussion it might help if you define what the "static vs. dynamic dependencies" issue is or at least link to the relevant thread.

Thanks. Although I see the two topics as only partially related, I added the 3 relevant links where all details about it can be found: the discussion on the gentoo dev-ml, the related bug, and, perhaps most important, the dynamic deps WikiPage.
I also added a brief explanation about the issue to the text, pointing out several times why the two problems are not directly related.

Genone
Retired Dev
Joined: 14 Mar 2003
Posts: 9532
Location: beyond the rim

PostPosted: Tue Aug 05, 2014 7:17 am

OMG, that "dynamic-deps" thing must be one of the worst ideas in Gentoo ever (plus the name is totally stupid). If people need a mechanism to "patch" metadata after installation they should come up with just that and not start adding broken semantics to core package manager systems.
So if I understand the original issue correctly (only checked the Wiki page for now) basically all that is needed is to replace vdb metadata with live ebuild metadata without actual remerge? Then a mix of 3b and 3c sounds the most appropriate solution to me ("global updates" have always been a nasty hack and could do with a proper redesign).

I understand you want to discuss the no-recompile proposal decoupled from the deps issue, but to me the whole idea of "updating" a package without changing its payload sounds conceptually wrong, so I'd rather get the underlying issue fixed properly than adding more hacks on top of an already overcomplicated system (well, I would if I had any business with Gentoo still).

mv
Watchman
Joined: 20 Apr 2005
Posts: 6747

PostPosted: Tue Aug 05, 2014 2:59 pm

Genone wrote:
I'd rather get the underlying issue fixed properly

The underlying issue cannot be fixed properly. I try to summarize the corresponding discussion once more in the following, although I emphasize once more that I see this as an independent topic.

The "static vs. dynamic issue" comes from the conflict that you have installed packages according to one "snapshot" state of the package tree but that the package tree evolves: Libraries get split, get merged, get replaced by other libraries (sometimes only partially) etc. - lots of variants can and do arise.
Whenever this happens, all packages which do (or might optionally) use any of the corresponding libraries might have become incompatible with the tree: if you use the static information (from /var/db), it might cause all sorts of conflicts and blockers against the current tree and hinder updates (where in some cases you do not even see that these updates are hindered), and if you use dynamic information (from the tree) your dependency information might not match the state of the installed files.
At first glance, both problems have "workarounds": the former, if every package with a changed dependency gets revbumped (at the cost of a lot of additional recompilations compared to now); the latter, if you make sure that you run emerge -NDu @world before any emerge --depclean.
Both of these strategies work most of the time, but both can fail in certain cases: the former if e.g. the revbump does not get "pushed" to the user because some circular dependency chain prevents the bump; for the latter I cannot recall an example at the moment, but one can certainly construct cases.
Both of these strategies fail horribly if you have packages installed for which there is no longer a maintained ebuild: In the static case the bump would not happen, and in the dynamic case you have no chance to replace one library by another if that would require recompilation.
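For clarity, the "dynamic deps" workaround mentioned above is simply the usual ordering, nothing more:
Code:
# refresh everything against the current tree before depclean may remove anything
emerge --sync
emerge -avuDN @world
emerge --ask --depclean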

To summarize from a "philosophical" point of view: the conflict between having (static) packages installed and wanting to update from a (dynamically changing) tree can never be solved satisfactorily. You have to choose the solution which causes the fewest issues. Since the needs of users differ, it is not possible to find a strategy which will make everyone happy. This is why this discussion has become highly emotional and political.

That's why I would prefer to not continue this discussion in this thread: All technical arguments have essentially been exchanged in the dev-ml discussion. One needs to find a compromise and establish a decision in one way or another.

A possibility to update /var/db in a controlled way without recompiling is especially useful if you have static deps and must revbump hundreds of packages due to a trivial library change. That's why suggestion 2.a came up first. However, a possibility to update e.g. your binpkgs to the current state of the tree, or e.g. to add a trivial USE change (from nonexistent to non-set) in certain cases without recompilation, is handy with dynamic deps, too. Of course, this can be allowed only in a way controlled by the ebuild maintainer, when he knows exactly that adding a deactivated USE flag does not modify the installed files: I would probably already have saved thousands of recompilations if this feature had existed earlier.

Quote:
Then a mix of 3b and 3c sounds the most appropriate solution to me ("global updates" have always been a nasty hack and could do with a proper redesign)

The problem with this is the amount of data: if you want to avoid the restrictions of a limited rule set as in 3.a, you have to remember more or less all changes of all dependencies forever, i.e. you need a rather large part of the full CVS/git history of the portage tree and of all overlays when you want to update. Or do you see another possibility?

Quote:
Regarding 2), the usual problem with "special" upgrades is always that it is not a property of a single CPV but the property of the relevant upgrade path.

The upgrade path is built into the suggestions: In case 2.a the upgrade path is "from the same revision but a lower minor revision".
In case 2.b or c the upgrade path is more flexible: It is the content of the variable mentioned.
In all other cases, a recompilation is done as usual.
Note that the user can easily force recompilation if for some reason he does not like the ebuild-maintainer's decision: He can just re-emerge the same version, and a re-emerge will of course recompile (by definition).
Quote:
as sooner or later people will use it for other purposes and then request more special casing

I am not sure whether having such a restriction is then an advantage or a disadvantage: It is of course a disadvantage that some possibilities are lost, but one should be aware that this feature is very delicate and needs much care, because it can cause very subtle issues. So "restricting" the creativity of developers here is perhaps not the worst thing which can be done.

Genone
Retired Dev
Joined: 14 Mar 2003
Posts: 9532
Location: beyond the rim

PostPosted: Wed Aug 06, 2014 8:54 am

I understood the problem, and with "the underlying issue" I actually meant the "update outdated vardb metadata". Just disagree with adding special hacks for abusing the remerge mechanics to solve it (and totally disagree with using tree metadata for installed packages for that matter) which I assumed you wanted to focus on.

As the existing "global update" operations fall into the same category it would seem smart to integrate this, but as you can't replace pkgmoves with standard ebuild operations (neither "dynamic deps" nor "metadata-only remerge"), the obvious suggestion for me would be to come up with a system to apply such tree changes to vardb in a consistent and transparent way. Now I haven't put much thought into that subject yet and I know it's far from trivial, so for now I'm just pointing out what I consider problems in the proposals so far.

Quote:
So "restricting" the creativity of developers here is perhaps not the worst thing which can be done.

What I tried to say there was that this doesn't work, never has. History has shown that people will bend, break or ignore such rules (not always intentionally) and then complain if it doesn't work as they intended and request further special casing to fit their use case. As you pointed out, subrevisions are already used in other contexts anyway, causing compatibility issues, and adding special semantics to them would further reduce any chance to change the versioning syntax in the future.

mv
Watchman
Joined: 20 Apr 2005
Posts: 6747

PostPosted: Wed Aug 06, 2014 9:39 am

Genone wrote:
the obvious suggestion for me would be to come up with a system to apply such tree changes to vardb in a consistent and transparent way.

This is easy to say in an abstract way, but what exactly do you mean by it?
Is there really something more consistent and transparent than a version bump? (If only it did not have the horrible disadvantage of requiring a lot of time to actually recompile the packages.)
Do not forget that you need to cover a lot of cases: possibly dependency updates of the form "foo/bar:=" -> "foo/bar2:=", possibly additions/removals happening only in some deeper || ( a/b || ( ... ) ) clauses, etc.

Genone
Retired Dev
Joined: 14 Mar 2003
Posts: 9532
Location: beyond the rim

PostPosted: Wed Aug 06, 2014 10:23 am

As said, I haven't put a lot of thought into it yet. Lets get to some common ground first to isolate the specific requirements before dabbling into solutions. What is needed is a mechanism to
a) replace a vardb ebuild (and derived metadata files) with its updated tree counterpart
b) potentially propagate the results to reverse dependencies (ideally not necessary as maintainers will update all affected ebuilds, and could cause problems)
c) apply further tree modifications (renames specifically) to the vardb state

Currently, ignoring that idiotic dynamic-deps behavior, remerges (including recompile) deal with a) and pkgmove instructions in global updates with c). slotmoves (do those still exist?) belong to group a) and b).
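Just to make (a) concrete, the manual equivalent would be something like the sketch below, for one hypothetical installed package and with the exact portageq invocation taken with a grain of salt; a real tool would also have to regenerate all derived metadata consistently:
Code:
c=app-misc pn=example pv=1.0
cp "/usr/portage/${c}/${pn}/${pn}-${pv}.ebuild" \
   "/var/db/pkg/${c}/${pn}-${pv}/${pn}-${pv}.ebuild"
# refresh e.g. the recorded RDEPEND from the tree (illustrative only):
portageq metadata / ebuild "${c}/${pn}-${pv}" RDEPEND \
    > "/var/db/pkg/${c}/${pn}-${pv}/RDEPEND"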

Can we agree on that so far?

mv
Watchman
Joined: 20 Apr 2005
Posts: 6747

PostPosted: Wed Aug 06, 2014 11:46 am

Genone wrote:
What is needed is a mechanism to
a) replace a vardb ebuild (and derived metadata files) with its updated tree counterpart
b) potentially propagate the results to reverse dependencies (ideally not necessary as maintainers will update all affected ebuilds, and could cause problems)
c) apply further tree modifications (renames specifically) to the vardb state

b) is not necessary - this must be done by the ebuilds using it.

steveL
Watchman
Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

PostPosted: Wed Sep 10, 2014 2:19 am

Hey mv, sorry to post so late, I've been quite busy and only caught up on ML a bit today, when I saw your response in mutt.
mv wrote:
Genone wrote:
I'd rather get the underlying issue fixed properly

The underlying issue cannot be fixed properly. I try to summarize the corresponding discussion once more in the following, although I emphasize once more that I see this as an independent topic.

Actually it can. It just requires acknowledgement that there's a glaring hole in the original portage design, which no-one wants to do.
Quote:
The "static vs. dynamic issue" comes from the conflict that you have installed packages according to one "snapshot" state of the package tree but that the package tree evolves: Libraries get split, get merged, get replaced by other libraries (sometimes only partially) etc. - lots of variants can and do arise.

As you state (and Genone concurs) the underlying issue is libraries and ABI breakage. The first step in tracking that is to recognise that runtime library dependencies are critical to the ongoing operation of the machine, and to express that metadata accordingly. The BSDs have been doing this for years, with LIB_DEPENDS.

From there you use the standard GNU binutils etc, to track the information we're tracking now, which we always knew we'd have to track, about linkage and soname changes. Given that base, you extend the toolkit with tools to extract, process and verify metadata about things like struct layout, alignment and symbol exposure (most of which are already written). Essentially you provide a QA service to the upstreams, which Gentoo already does manually, at the same time as you provide assurance to your downstream users that their machines aren't going to be broken by an upgrade without warning.
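To be clear about what "the information we're tracking now" looks like at the lowest level, it is nothing exotic, just the DT_NEEDED and SONAME entries in the ELF headers; the paths here are only examples:
Code:
for f in /usr/lib64/libfoo.so.1 /usr/bin/foo; do
    echo "== ${f}"
    readelf -d "${f}" | grep -E 'NEEDED|SONAME'
done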

Those tools feed back to the developer before they commit, of course, and enable people like patrick who run tinderboxes, as well as downstream bindists, not to mention network admins, to provide much more in-depth QA to each other and their users.

There's another aspect to the discussion you're having, about updates to the vdb in a controlled fashion, but first and foremost we should be protecting end-user machines, since the same tools notify us of breakage before we commit, and to do so they give us the data we need, in order even to have the information to propagate in the first place; at least if we want that reliably.

However we won't do it by ignoring the notational problem, since (as my boss always keeps telling me;) good notation is central:
Kernighan & Pike wrote:
The power of notation comes from having a good one for each problem.. languages together are more powerful than any one of them in isolation.
It's worth breaking the job into pieces if it enables you to profit from the right notation.
("The Practice of Programming", 1999)

I'll have to come back to the other part about distributing updates, as I haven't followed everything you've written closely, and I need to get some sleep. Not ignoring it, it looks quite interesting. Nor is it exactly a hard problem, imo; the underlying library issue (and the completely inadequate design) is the actual root cause.

Regards,
steveL.

mv
Watchman
Joined: 20 Apr 2005
Posts: 6747

PostPosted: Wed Sep 10, 2014 8:04 pm

steveL wrote:
From there you use the standard GNU binutils etc, to track the information we're tracking now

This is what portage's @preserved-rebuild already does, but this is not a solution:
First of all, GNU binutils only covers binary programs - interpreted languages or other logical dependencies (e.g. on some data files) are not caught by it.
Second, and more important, it does not solve the problem - which as I claimed is unsolvable:
If a package providing a library is removed, and a newly introduced package provides this library instead, no automatic mechanism can know this (unless you ship precompiled binaries or lists of libraries for each USE-flag combination).
And the difficulty is what to do in such a situation if at least the package maintainer is aware of such a situation (of course, he might make mistakes, but this is a further problem): Keeping the obsolete state is wrong - which is what static deps do - when you eventually want to update the machine. On the other hand, if you want to "cement" the current state and actually do not want to upgrade to the new package constellation, the dynamic deps will not do what you want.
IMHO, the attitude that you can "cement" the state of the machine while using a changing tree is the actual problem: Sooner or later it will bring you into a situation where it is practically impossible to cope with the current tree: You just postpone the problem until one day it is really unsolvable and practically requires a reinstallation. So far, nobody has experienced this, since portage does dynamic deps by default.
This is the reason why I so vehemently favour dynamic deps... But now we are again in the middle of a discussion which should not be the topic of the thread...

steveL
Watchman
Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

PostPosted: Thu Sep 11, 2014 5:00 pm

mv wrote:
steveL wrote:
From there you use the standard GNU binutils etc, to track the information we're tracking now

This is what portage's @preserved-rebuild already does

That would be "the information we're tracking now". Portage uses it for that purpose; the point is that we have information about both the user's runtime system and the specific combinations of packages built under the USE flags the user specified, which they are about to install. There's more that can be done, but (much) more painfully if we're so unclear about library dependencies in our basic notation.
Quote:
but this is not a solution:
First of all, GNU binutils only covers binary programs - interpreted languages or other logical dependencies (e.g. on some data files) are not caught by it.

Hence the notational problem, which is still very much cogent in the realm of binary packages, since it's hard to reason about something so central, especially when coding, if you have no direct expression of it in the notation you use to do your thinking (you probably want the pdf if you don't already have it, but the html page may be preferred by some).

However you should bear in mind that it's always possible to bootstrap interpreters, so long as you have a fundamental set of binary libraries. IOW if we sort out the binary situation, the interpreter scenario is both done, and not a problem even if it should break down, since we can bootstrap the interpreters, but it's a lot harder to bootstrap the binary side. OFC neither is really an issue, since that's what prior, known-working livedisks are for, but in the hypothetical scenario of bringing up a toolchain that doesn't need perl to evaluate configure scripts, and even in it, properly-tracked ABI dependencies are a much more useful base. They're just much harder to get right with such a crippled notation.
Quote:
Second, and more important, it does not solve the problem - which as I claimed is unsolvable:
If a package providing a library is removed, and a newly introduced package provides this library instead, no automatic mechanism can know this (unless you ship precompiled binaries or lists of libraries for each USE-flag combination).
And the difficulty is what to do in such a situation if at least the package maintainer is aware of such a situation (of course, he might make mistakes, but this is a further problem):

Ah this is where I see the fundamental problem in your argument: you need to separate out what happens on the user machine from what happens in the tree. The latter flows from the former, and we have all the information we need to make decisions on the user's machine, even if usually that should be simply to bail out with decent information about the problem. Higher layers/other modules can then script around that however you like. First get the core working properly.

I wouldn't even begin to approach the tree side of things without the basis.
Quote:
Keeping the obsolete state is wrong - which is what static deps do - when you eventually want to update the machine. On the other hand, if you want to "cement" the current state and actually do not want to upgrade to the new package constellation, the dynamic deps will not do what you want.
IMHO, the attitude that you can "cement" the state of the machine while using a changing tree is the actual problem: Sooner or later it will bring you into a situation where it is practically impossible to cope with the current tree: You just postpone the problem until one day it is really unsolvable and practically requires a reinstallation.

It's certainly true that we have a "current state of the machine" before we emerge anything, and that includes information about all the binary libs installed, although it's not as good as it could be. The point is that if we were used to thinking in terms of "lib_depends" as a code concept, alongside rdepends and depends (which are "bdepends" in BSD terms) firstly both the latter sets would be a lot smaller, and secondly everyone would naturally be writing scripts to track lib_depends, in a transparent fashion, which makes cross-project collaboration, as well as actually dealing with this issue via automated tools, much more feasible.

Again, though your argument is weaker, imo, since it's not distinct.
Quote:
So far, nobody has experienced this, since portage does dynamic deps by default.
This is the reason why I so vehemently favour dynamic deps... But now we are again in the middle of a discussion which should not be the topic of the thread...

Well i appreciate you want to discuss mechanisms to update the vdb, and I agree there's fun to be had there. It's just not that difficult a problem, and in fact the real problem you wanted to discuss that as a solution to, is much more fundamental. But by all means, I'll enjoy that discussion too; I just won't believe it's a solution to the problem you describe, although I concede it would likely be the tail-end of one. Good luck getting the data.

mv
Watchman
Joined: 20 Apr 2005
Posts: 6747

PostPosted: Fri Sep 12, 2014 8:41 am

steveL wrote:
However you should bear in mind that it's always possible to bootstrap interpreters

You are thinking about a running toolchain. But this should be kept separately IMHO. This is the problem of @system containing the proper things.
The discussion I am thinking about is more about the more advanced parts of the system where you have really complicated dependencies; the dependencies of the toolchain itself are not complicated and probably also do not change heavily over time.
Quote:
since we can bootstrap the interpreters, but it's a lot harder to bootstrap the binary side.

This is not true. For instance, you cannot bootstrap icedtea: You need a running java to compile icedtea. Dependencies in higher languages can be as severe as in C or even worse, especially in projects requiring several languages in the build system.
But the point is not whether some expert who has knowledge about all dependencies and manually works around issues can do it after many hours but whether it just works automatically: For a real expert, it makes no difference whether you have dynamic or static deps since he can just emerge everything in the required order with -O.
Quote:
Ah this is where I see the fundamental problem in your argument: you need to separate out what happens on the user machine from what happens in the tree.

Of course, this is the root of the problem and why the problem is unsolvable.
This separation is not a requirement which I artificially introduce, but which exists by its mere nature.
Actually, not what is "happening" but what "has happened", because you cannot see the history from the tree, and it would be too much data to store all possibly relevant information about the history in the tree: You have some state of packages on your machine and another possibly rather divergent (if the last sync was long ago) state in the tree which are vastly incompatible.
This is just the situation you have, and you have to deal with it. Then one solution ("dynamic deps") is to ignore the state of the machine as far as possible, and the other solution ("static deps") is to use only that part of the tree which is compatible and ignore the rest. Both solutions can lead to all sort of problems.
Quote:
even if usually that should be simply to bail out with decent information about the problem.

In complex dependency situations, it is impossible to give reasonable information: dependency resolving, especially when USE-flag changes and circular deps are involved, is a hard problem (perhaps even NP-hard?), and you can see from current portage output that the "located" problem is often actually the wrong place. There is not much hope that this can be improved: the only decent sources of information are the full dependency trees, and this information is still incomplete (since e.g. the libs which the USE flags will actually install cannot be seen). Yes, maybe one can extend the dependency language to include this information, but then it would be a very complex language, easy to make mistakes in, etc., and still not all information would be included (e.g. why it is impossible to bootstrap icedtea).
Even worse: with static deps there might appear to be no problem (everything can be resolved), although actually you might be using packages which are maintained only for compatibility reasons and not the actual "main" packages: these main packages might conflict with your current state, and thus you are not offered them but the "working" solution with the old packages is chosen automatically.
With dynamic deps, OTOH, you are more likely to be offered a solution which appears to work which then for some reason actually doesn't, because some of your packages would first have to be recompiled to use some new ABI.
Quote:
The point is that if we were used to thinking in terms of "lib_depends" as a code concept, alongside rdepends and depends

Which information is contained in lib_depends which is not already contained in rdepends and the rdepends string stored in /var/db? (Possible mistakes of ebuild maintainers set aside.)

steveL
Watchman
Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

PostPosted: Fri Sep 12, 2014 10:46 am

mv wrote:
steveL wrote:
However you should bear in mind that it's always possible to bootstrap interpreters

You are thinking about a running toolchain. But this should be kept separately IMHO. This is the problem of @system containing the proper things.
The discussion I am thinking about is more about the more advanced parts of the system where you have really complicated dependencies; the dependencies of the toolchain itself are not complicated and probably also do not change heavily over time.

Well, your argument is all over the shop, then (that means "it keeps jumping from place to place".)

Firstly, toolchain is the basis we bootstrap from, as anyone who's done any cross-compilation knows. So that's essential, are we agreed, or not?

And in fact I was thinking of perl, nothing else, given the toolchain; as you say, the toolchain's not that complicated in dep-terms, and further we're used to handling it (and we automate handling of it in any case.)
Quote:
Quote:
since we can bootstrap the interpreters, but it's a lot harder to bootstrap the binary side.

This is not true. For instance, you cannot bootstrap icedtea: You need a running java to compile icedtea. Dependencies in higher languages can be as severe as in C or even worse, especially in projects requiring several languages in the build system.
But the point is not whether some expert who has knowledge about all dependencies and manually works around issues can do it after many hours but whether it just works automatically: For a real expert, it makes no difference whether you have dynamic or static deps since he can just emerge everything in the required order with -O.

And what do you think a script is about, if not encapsulating domain-specific knowledge?

Further I'd like to see you do any other package building without a toolchain sorted first.
Quote:
Quote:
Ah this is where I see the fundamental problem in your argument: you need to separate out what happens on the user machine from what happens in the tree.

Of course, this is the root of the problem and why the problem is unsolvable.
This separation is not a requirement which I artificially introduce, but which exists by its mere nature.

Huh? My point was that you're not actually dealing with the separation at all, but have been talking across it, with your argument jumping between what happens on the user machine and what happens in the tree.

So no, you didn't introduce it: I did, because it exists as you say, and because you've been ignoring it up till now.

And ffs stop with the unsolvable bit, since you tried that before with a completely different basis for being unsolvable (snapshot vs installed, both on the user machine.)
Quote:
Actually, not what is "happening" but what "has happened", because you cannot see the history from the tree, and it would be too much data to store all possibly relevant information about the history in the tree: You have some state of packages on your machine and another possibly rather divergent (if the last sync was long ago) state in the tree which are vastly incompatible.
This is just the situation you have, and you have to deal with it. Then one solution ("dynamic deps") is to ignore the state of the machine as far as possible, and the other solution ("static deps") is to use only that part of the tree which is compatible and ignore the rest. Both solutions can lead to all sort of problems.

*sigh* you're still convinced that you have a solid understanding of everything that's going on, and I'm sure you do: I just can't see it in the arguments you've presented. Both "solutions" suffer from a lack of decent information.
Quote:
Quote:
even if usually that should be simply to bail out with decent information about the problem.

In complex dependency situations, it is impossible to give reasonable information: dependency resolving, especially when USE-flag changes and circular deps are involved, is a hard problem (perhaps even NP-hard?), and you can see from current portage output that the "located" problem is often actually the wrong place. There is not much hope that this can be improved: the only decent sources of information are the full dependency trees, and this information is still incomplete (since e.g. the libs which the USE flags will actually install cannot be seen). Yes, maybe one can extend the dependency language to include this information, but then it would be a very complex language, easy to make mistakes in, etc., and still not all information would be included (e.g. why it is impossible to bootstrap icedtea).
Even worse: with static deps there might appear to be no problem (everything can be resolved), although actually you might be using packages which are maintained only for compatibility reasons and not the actual "main" packages: these main packages might conflict with your current state, and thus you are not offered them but the "working" solution with the old packages is chosen automatically.

Actually I meant information along the lines of "not installing cat-foo/pkg-fubar-0.1.1 as: libfubar-0.1.1.so breaks compatibility with libfubar-0.1.0.so but has the same soname" after you've done the make DESTDIR=.. install. On the user machine, based on what they have currently installed, first and foremost. Get that right, and it becomes a lot easier to do the distribution side. Without it, the distribution side won't happen anywhere near as effectively as it could.
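i.e. a check of roughly this shape, run against the image before it ever hits the live filesystem; the file names are made up and real ABI checking obviously needs more than exported symbol names, but the tools are all standard:
Code:
# compare dynamic symbols exported by the installed lib and the new image;
# complain if anything vanished while the SONAME stayed the same
old=/usr/lib64/libfubar.so.0
new="${D}/usr/lib64/libfubar.so.0"   # ${D} = image dir after src_install
removed=$(comm -23 \
    <(nm -D --defined-only "${old}" | awk '{print $3}' | sort -u) \
    <(nm -D --defined-only "${new}" | awk '{print $3}' | sort -u))
[[ -n ${removed} ]] && echo "libfubar.so.0: removed exported symbols:" ${removed}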

Not sure what all this "new language" stuff is about: it sounds like more invention along the lines of the crap that's holding the project back in so many ways, but since it's you I'm willing to accept you have something in mind. That doesn't mean it's needed.

It's odd that you can start to envisage a whole new language, but ignore basic notation. Did you even check the links I gave you? Am I to assume you've read Iverson's paper before, as well as "The Practice of Programming" and this is all old-hat to you? Then YTF haven't you even addressed that point? It's like you have a blind-spot, or are blinkered in your thinking, which I don't want to believe based on the respect I have for you, after what has been useful (for my work) interaction on these forums over several years.
Quote:
With dynamic deps, OTOH, you are more likely to be offered a solution which appears to work which then for some reason actually doesn't, because some of your packages would first have to be recompiled to use some new ABI.

The point I'm making is that it's pitiful that we still have not automated checks of that, and the reason for that is that lib_depends are only ever expressed indirectly, and hand-waving about revdep-rebuild has now morphed into hand-waving about (sub,)slot-operators instead of dealing with the notational problem.
Quote:
Quote:
The point is that if we were used to thinking in terms of "lib_depends" as a code concept, alongside rdepends and depends

Which information is contained in lib_depends which is not already contained in rdepends and the rdepends string stored in /var/db? (Possible mistakes of ebuild maintainers set aside.)

As so many Gentoo devs do, you're focussing on the problem that's right in front of you now, and ignoring the bigger picture. When I said "good luck getting the data", I hoped you'd see what I meant immediately (from experience): good luck managing that data flow, as well as processing it, over time.

To answer you directly: none, currently, though you have to extract it from other strings, and hope the format doesn't change any (but presumably that's ok in your world, as we're all tied to one codebase?) However once you have the concept, you can then extend the syntax from a clean base, having stipulated that certain operators (in the pre-LIB_DEPENDS sense: they'd be removed in a better design, most likely) apply by default. The point is that you treat a certain class of dependencies, the ones every upstream, every developer and every user, worry about, specially, because they are fundamental.

For an upstream it's something external they rely on, and need to inform the user about on their "build from source" page, and if not: in their "dependencies" or "other software" section.

For a user it's something they need to get the software they want working, it's something that breaks (or breaks something else, apparently unconnected for a novice) if upgrades are not done carefully, and for some reason it's never been handled with any kind of nous. Instead we've had years of unnecessary revdep-rebuilds, and now we're given a turd of a design because no-one has the guts to face down McCreesh, so his frankly woeful ideas gain traction. (EAPI-in-suffix very nearly made it through that "rigorous" review process, if you recall.)

For a developer, it's important because of the above, and because no matter how arrogant or brainwashed they may be, no-one wants to field more bug-reports than they have to, and let's be realistic: it doesn't look very good when you don't provide decent upgrade paths, however much some may enjoy forcing users to use stuff they don't want.

Are we agreed they're fundamental, or do you wish to argue that point as well?

Does it not strike you as even remotely peculiar that the most fundamental form of dependency is not directly expressible?

Use your imagination to answer your own questions: imagine for one moment that LIB_DEPENDS are expressed directly in the ebuild, meaning that the packages listed are both RDEPENDS and DEPENDS in Gentoo terms: they must be built before this package, are required at runtime, and must be built in ROOT, ie for CHOST not CBUILD (ignoring toolchain, and focussing on the 99%.) They have an implicit binding, so no slot-operators are necessary. Taking the fundamental case first, we track any binary libraries in those packages, and run a few simple tools to gather ABI metadata, more than we do now. What would you do from that basis?
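So, purely as a thought experiment (none of this is existing Gentoo syntax, the variable name is simply borrowed from BSD ports, and the package names are placeholders):
Code:
# hypothetical ebuild fragment with a direct library-dependency notation
LIB_DEPENDS="dev-libs/libfoo dev-libs/libbar"  # linked-against libs: needed at
                                               # build time and runtime, built
                                               # for CHOST, implicitly ABI-bound
RDEPEND="app-misc/somedata"                    # remaining runtime-only deps
DEPEND="virtual/pkgconfig"                     # remaining build-only deps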

If you're going to ignore the notational aspects, and the documents I link you to, then you should just get on with the discussion about modifying the vdb, as I indicated in my prior post.

To get you back to the vdb thing:
Genone wrote:
As said, I haven't put a lot of thought into it yet. Lets get to some common ground first to isolate the specific requirements before dabbling into solutions. What is needed is a mechanism to
a) replace a vardb ebuild (and derived metadata files) with its updated tree counterpart
b) potentially propagate the results to reverse dependencies (ideally not necessary as maintainers will update all affected ebuilds, and could cause problems)
c) apply further tree modifications (renames specifically) to the vardb state

Currently, ignoring that idiotic dynamic-deps behavior, remerges (including recompile) deal with a) and pkgmove instructions in global updates with c). slotmoves (do those still exist?) belong to group a) and b).

Can we agree on that so far?

mv thinks b) should be done by ebuild authors and (afaict) that dynamic deps are good. I'll bow out for a while, noting only that dynamic-deps just need to be made smarter, based on more useful information which flows best from a smarter design. And then dynamic deps are the vdb updates.

steveL
Watchman
Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

PostPosted: Fri Sep 12, 2014 4:22 pm

This patch looks like it will make dynamic-deps a lot smarter, in that they will take installed packages (info on the user machine) into account.

mv
Watchman
Joined: 20 Apr 2005
Posts: 6747

PostPosted: Fri Sep 12, 2014 8:47 pm

steveL wrote:
Firstly, toolchain is the basis we bootstrap from, as anyone who's done any cross-compilation knows. So that's essential, are we agreed, or not?

Toolchain is essential for the system, but practically irrelevant for our discussion: in the foreseeable future nothing will change concerning the toolchain, no matter whether we have dynamic deps, static deps, some mechanism to update the database, etc. Simply put, the toolchain is a more-or-less static piece of software for which dependencies practically do not matter.
Quote:
And what do you think a script is about, if not encapsulating domain-specific knowledge?

I do not understand. Neither of us talked about any script before. Which script do you mean which does emerge's job on user's systems? Who wrote it? Why did it get domain-specific knowledge?
Quote:
I did, because it exists as you say, and because you've been ignoring it up till now.

Where did I ignore it? I said from the very beginning that you cannot "cement" the state of the machine without taking the changes of the tree into account. Of course, I assume here that the tree changes independently of the machine - that's why people use emerge --sync, and I supposed that this does not have to be explained.
Quote:
with a completely different basis for being unsolvable (snapshot vs installed, both on the user machine.)

snapshot vs. installed = tree vs. user's system. Exactly the separation you claim to be missing. Somehow there seems to be a severe misunderstanding between us.
Quote:
Both "solutions" suffer from a lack of decent information.

I am actually getting tired of repeating all the arguments over and over.
Moreover, this discussion really leads nowhere: a decision will be made concerning the dynamic or static deps default and the corresponding tree policy. This will be made by the council or by some (hopefully many) gentoo developers, and the discussion here will simply have no influence on that decision.
Therefore, I think further discussing this topic here is just a waste of time.
Quote:
Actually I meant information along the lines of "not installing cat-foo/pkg-fubar-0.1.1 as: libfubar-0.1.1.so breaks compatibility with libfubar-0.1.0.so but has the same soname" after you've done the make DESTDIR=.. install.

So you mean information for the ebuild writer, so that he does not forget to insert correct {R,}DEPENDS? This again seems to be a very different topic.
Quote:
Not sure what all this "new language" stuff is about

Probably this is another misunderstanding between us: I interpreted your links (apparently wrongly) as suggesting an extension of such a kind. (See below for why I understood it that way.)
Quote:
The point I'm making is that it's pitiful that we still have not automated checks of that, and the reason for that is that lib_depends are only ever expressed indirectly

Whether you express them in terms of lib_depends or more directly as RDEPENDS with (correct) subslots does not matter. The problem is what to do with this information; more precisely, what to do when the stored one and the one in the tree are incompatible. (Apart from that, as I already said and we seem to agree: storing the provided libs in the portage tree (in contrast to /var/db) is a task close to impossible.)
Quote:
As so many Gentoo devs do, you're focussing on the problem that's right in front of you now, and ignoring the bigger picture

There is no "bigger picture": we are talking about a package manager, not about the answer to life, the universe, and everything.
Especially for a PM compiling from source, all sorts of unexpected problems (also human mistakes) can and do occur from time to time. Trying to eliminate human errors by making the algorithm too smart is, as almost always, not the best solution: KISS. That's what made the success of ports/portage. IMHO, the subslots were already a step towards a too complicated solution and are actually the reason why the problems have become so severe: they were not properly implemented (forcing a mixture of static vs. dynamic deps).
Fortunately, it seems that Zac's patch you posted might solve that problem, finally.
Quote:
Does it not strike you as even remotely peculiar that the most fundamental form of dependency is not directly expressible?

No, because the provider is not directly expressible in a source-based distribution. What you have in mind works perfectly for a binary distribution (where the task of a PM is really easy. In fact, this is how rpm works, for instance).
However, in the presence of USE-flags and dozens of architectures, you cannot reasonably store this information in the tree (only in /var/db where it is not too useful).
If you want to store it in the tree, you need a complex new syntax (describing for which USE flags and which architecture which libraries will be provided by the package): this is why I interpreted your links as suggesting such a syntax.
Without such an extended complex language which allows the PM to recognize the provider, the information that you want "foo.so.0" is completely useless...
And that extended complex language calls for further human errors. One cannot autogenerate this information: The number of USE-flag combinations is too large. It is the developer who must analyze which combinations will install which libraries.
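Just as an illustration of the growth (obviously a crude upper bound, since flags are rarely fully independent):
Code:
# n independent USE flags allow 2^n combinations, each of which could
# install a different set of libraries
for n in 5 10 20; do echo "${n} USE flags -> $(( 2 ** n )) combinations"; done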