Gentoo Forums
Question regarding creating ebuilds
lyallp
Veteran

Joined: 15 Jul 2004
Posts: 1552
Location: Adelaide/Australia

Posted: Sat Feb 23, 2013 7:20 am    Post subject: Question regarding creating ebuilds

It's possibly a silly question, but I was wondering whether there is a tool or framework that lets someone 'build' a program from source, identifies all of the software and libraries that were used during the build, and then lists the packages that provide them, so that an ebuild can be written that fully lists the build's dependencies.

I guess it would be something within the sandbox that tracks the files used during the build and then generates a summarised list of the packages that must be installed simply to build.
_________________
...Lyall
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21431

Posted: Sat Feb 23, 2013 4:30 pm

I am not aware of such a tool. It would be useful not only for the purpose you describe, but also to validate that an existing ebuild has a correct dependency list.

There are several parts to solving this. The simplest piece, which could be written quickly by someone with the relevant background, would be a check that no automagic dependencies were added. It would scan the installed files to check that every library they use is, either directly or indirectly, provided by a package in the RDEPEND list. However, this is insufficient. First, it fails to validate the DEPEND list in any way. Second, it completely ignores any runtime dependencies that cannot be determined from static inspection of the installed files.
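
A minimal sketch of that first check, written in Python purely for illustration. Everything here is an assumption: the function name find_automagic_deps, the library and package names, and the two lookup tables (library to owning package, package to its own dependencies) are made-up inputs, not data read from a real Portage tree.

```python
# Sketch only: inputs are hypothetical tables, not real Portage data.

def find_automagic_deps(needed_libs, lib_owner, rdepend, pkg_deps):
    """Return {library: owning package} for libraries not covered by RDEPEND.

    needed_libs: library names the installed files link against
    lib_owner:   library name -> package that installs it
    rdepend:     packages listed in the ebuild's RDEPEND
    pkg_deps:    package -> packages it depends on at runtime
    """
    # Expand RDEPEND to everything reachable through the listed packages,
    # because a library may be provided indirectly.
    covered = set()
    stack = list(rdepend)
    while stack:
        pkg = stack.pop()
        if pkg in covered:
            continue
        covered.add(pkg)
        stack.extend(pkg_deps.get(pkg, ()))

    uncovered = {}
    for lib in sorted(needed_libs):
        owner = lib_owner.get(lib)  # None: no known package owns this library
        if owner not in covered:
            uncovered[lib] = owner
    return uncovered
```

Anything the function returns is a candidate automagic dependency. As noted above, this says nothing about DEPEND or about dependencies that static inspection cannot see.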
lyallp
Veteran

Joined: 15 Jul 2004
Posts: 1552
Location: Adelaide/Australia

Posted: Sun Feb 24, 2013 1:03 am

I was thinking something a little more comprehensive.

strace (using '-y -e trace=open' plus other appropriate arguments) on the sandbox of the build.
Record every file hit outside the source/build directory (gcc, make, libraries, etc), excluding key temporary locations such as /tmp and /var/log.
Remove duplicate file references to attempt to speed things up.
Associate each file with a package (this would be the time-consuming part, unless Portage constructs a more efficient database of file-to-package correspondence).
Print the list of packages.
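
The steps above could be sketched roughly like this in Python. It assumes the usual strace line layout, e.g. open("/path", O_RDONLY) = 3, and also accepts openat, which newer systems tend to use instead of open. The names files_opened_outside and packages_for are invented for this sketch, and the file-to-package lookup is passed in as a plain callable; on a real Gentoo box it might be backed by something like qfile or equery belongs.

```python
import re

# Matches open("path", ...) = N and openat(DIRFD, "path", ...) = N.
OPEN_RE = re.compile(
    r'\b(?:openat\([^,]+,\s*|open\()"((?:[^"\\]|\\.)*)"[^=]*=\s*(-?\d+)'
)

# Temporary locations excluded in the description above.
IGNORED_PREFIXES = ("/tmp/", "/var/log/")


def files_opened_outside(trace_lines, build_dir):
    """Return the sorted, de-duplicated files opened outside build_dir."""
    build_prefix = build_dir.rstrip("/") + "/"
    seen = set()
    for line in trace_lines:
        match = OPEN_RE.search(line)
        if not match or int(match.group(2)) < 0:
            continue  # not an open call, or the open failed
        path = match.group(1)
        if not path.startswith("/"):
            continue  # assumption: relative paths stay inside the build tree
        if path.startswith(build_prefix) or path.startswith(IGNORED_PREFIXES):
            continue
        seen.add(path)
    return sorted(seen)


def packages_for(paths, owner_of):
    """Map files to packages; owner_of(path) returns a package name or None."""
    owners = (owner_of(path) for path in paths)
    return sorted({owner for owner in owners if owner is not None})
```

De-duplicating before the package lookup keeps the expensive step down to one query per distinct file.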

Obviously this would tend to indicate the versions of the packages that are currently installed, but the developer may be able to fine-tune that somewhat so as not to constrain the ebuild to a very narrow band of package versions.

This way, you end up with a complete list of dependencies, including build dependencies, not just runtime.

Just a thought.

:)
_________________
...Lyall
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21431

Posted: Sun Feb 24, 2013 5:05 am

Yes, that is one way to do it. However, that is a fairly expensive approach, and it requires some clever rules in the post-processing phase to avoid scooping up a package solely because the scooped package is used by a tool that happens to be invoked in the build process. For example, should the ebuild gain a build-time dependency on ncurses because it invoked bc to perform big number math? For me, bc has a load time dependency on ncurses, so tracing bc will show it accessing the ncurses library.
lyallp
Veteran

Joined: 15 Jul 2004
Posts: 1552
Location: Adelaide/Australia

Posted: Sun Feb 24, 2013 6:54 am

With regard to expense, the cost is only paid during ebuild construction, not during normal installation.

So, maybe only monitor 'forks' and the executables that are run, ignoring the shared libraries used by those executables.

For example, just following 'strace -e trace=process' looking for 'execve()' calls and only generating the list of packages for those execve() that fall outside of the emerge source build tree.
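
A rough Python sketch of that filter, assuming strace prints execve lines in its usual form, e.g. execve("/usr/bin/bc", ["bc", "-l"], ...) = 0. The function name executed_outside is made up for illustration.

```python
import re

# Matches execve("path", [...], ...) = N, where N is 0 on success.
EXECVE_RE = re.compile(r'\bexecve\("((?:[^"\\]|\\.)*)".*\)\s*=\s*(-?\d+)')


def executed_outside(trace_lines, build_dir):
    """Return the sorted executables run from outside build_dir."""
    build_prefix = build_dir.rstrip("/") + "/"
    seen = set()
    for line in trace_lines:
        match = EXECVE_RE.search(line)
        if not match or int(match.group(2)) != 0:
            continue  # not an execve, or the exec failed (e.g. a PATH probe)
        path = match.group(1)
        if path.startswith(build_prefix):
            continue  # helper shipped inside the source tree
        seen.add(path)
    return sorted(seen)
```

Failed execve calls are dropped on purpose: a shell walking PATH produces many of them, and they do not indicate a tool that was actually used.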

It wouldn't be foolproof, as there may be some files (fonts? images?) that won't get caught in this manner, but it would be better than having to do it manually.

Still, I find this an interesting thought exercise. :)
_________________
...Lyall
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21431

Posted: Sun Feb 24, 2013 5:44 pm

If you do not inspect open, then you will miss accesses to data files, headers, and static libraries. If you do inspect open, you must use special rules to avoid depending on shared libraries that were opened by the dynamic loader.
lyallp
Veteran

Joined: 15 Jul 2004
Posts: 1552
Location: Adelaide/Australia

Posted: Mon Feb 25, 2013 8:53 am

You are correct: if I do not inspect open, I won't see the shared libraries and such used by the build tools.

But if I monitor executables, each of those executables belongs to its own ebuild and has its own dependencies, which I don't care about. So long as I say 'requires bc', who cares if 'bc' requires ncurses? That would be part of the requirements of the 'bc' ebuild.

As a post step of the ebuild compilation, we could then run 'ldd' on the package's installed components to identify the shared libraries that the built components use, as opposed to the shared libraries used by the build tools.
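
A small Python sketch of that post step, assuming the common glibc ldd output layout of "name => /path (address)" and "name => not found". The function name parse_ldd is made up, and the output is passed in as text so the sketch does not depend on running ldd itself.

```python
import re

# One ldd line: "libfoo.so.1 => /lib64/libfoo.so.1 (0x...)" or "... => not found".
LDD_LINE_RE = re.compile(
    r'^\s*(\S+)\s+=>\s+(.+?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$'
)


def parse_ldd(output):
    """Split ldd output into ({soname: resolved path}, [unresolved sonames])."""
    resolved = {}
    missing = []
    for line in output.splitlines():
        match = LDD_LINE_RE.match(line)
        if not match:
            continue  # lines without "=>", e.g. the vdso or the loader itself
        name, target = match.group(1), match.group(2)
        if target == "not found":
            missing.append(name)
        elif target.startswith("/"):
            resolved[name] = target
    return resolved, missing
```

One caveat: ldd prints the whole resolved closure, including libraries pulled in by other libraries, so the direct NEEDED entries (for instance as shown by readelf -d) may be a tighter input for a dependency list.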

Those library ebuilds would have their own dependencies, which the ebuild under construction does not care about.

Thus, we don't need to monitor 'open', because we would have caught the executables used in the build process and identified the shared libraries that the built components use. The only thing we wouldn't catch is static libraries. That is no great loss in the grand scheme of things compared to constructing the dependency list manually, which would be more error-prone.

It gets easier as we discuss it more :)
_________________
...Lyall
Genone
Retired Dev

Joined: 14 Mar 2003
Posts: 9501
Location: beyond the rim

Posted: Mon Feb 25, 2013 12:02 pm

lyallp wrote:
But if I monitor executables, each of those executables belongs to its own ebuild and has its own dependencies, which I don't care about. So long as I say 'requires bc', who cares if 'bc' requires ncurses? That would be part of the requirements of the 'bc' ebuild.

As a post step of the ebuild compilation, we could then run 'ldd' on the package's installed components to identify the shared libraries that the built components use, as opposed to the shared libraries used by the build tools.

Unfortunately this isn't just black and white; there is a lot of grey too (think conditional dependencies, USE dependencies, ...). Also, ldd looks nice on the surface for tracking library dependencies, but it will, for example, miss all libraries loaded via dlopen and relatives. A few years ago there used to be a FEATURE=verify-rdepends (I think that was the name) to do something like this; it was removed due to complexity and false results. And don't forget dependencies on things like header or data files (e.g. xproto- packages) or any other non-trivial cases. Long story short, the only things you would catch are the directly linked dynamic libraries and the executables, which can just as easily be obtained manually and are rarely a problem, while you would miss all the "interesting" dependencies (those that cause tricky problems).
Hu
Moderator

Joined: 06 Mar 2007
Posts: 21431

Posted: Tue Feb 26, 2013 3:18 am

My point with regard to bc is that you want to avoid adding a dependency on ncurses, but if you monitor library loads, then you need a special rule to throw out shared libraries brought in by the dynamic loader resolving the dependencies of the executed program. You cannot ignore all shared libraries, because some of them may be optional features loaded at runtime due to the way the monitored package invoked the helper program.
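
One possible shape for such a rule, as a Python sketch under stated assumptions: the set of libraries the dynamic loader would map for the helper is already known (for example from ldd on the helper), and everything else the helper opened is treated as an explicit load. The function name split_library_opens and the plugin path in the usage note are hypothetical.

```python
def split_library_opens(opened_libs, loader_resolved):
    """Split shared libraries opened by a helper into two sorted lists.

    opened_libs:     library paths seen in the helper's open calls
    loader_resolved: paths the dynamic loader maps for the helper anyway
    Returns (automatic, explicit): automatic loads can be thrown out,
    explicit loads may be optional features that deserve a dependency.
    """
    unique = set(opened_libs)
    automatic = sorted(path for path in unique if path in loader_resolved)
    explicit = sorted(path for path in unique if path not in loader_resolved)
    return automatic, explicit
```

With bc, the ncurses library would land in the automatic list and be discarded, while something like a hypothetical /usr/lib64/bc/plugin.so opened at runtime would stay in the explicit list for a human to judge.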

By ignoring open, you also miss headers. Remember that for some C++ programs, a tremendous amount of power can be delivered in a header-only project. Some of the Boost libraries fall into this category. By failing to detect the use of those headers, you omit the entire package from the dependency list.

Also, for scripting languages, you miss most or all of the dependencies. Suppose the build system runs a Python script that is shipped with the package being monitored, but that script uses import to load non-core Python modules (e.g. numpy) for extra functionality. By monitoring exec, you can tell that the build system ran Python, but you cannot tell what modules were loaded by that Python instance.
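
To illustrate the gap, here is a small Python sketch that reads a script's source and lists the top-level modules it imports, using the standard ast module. It is only a partial complement to exec tracing: it is static, so it sees nothing loaded through importlib or __import__ at runtime. The function name imported_modules is made up for this example.

```python
import ast


def imported_modules(source):
    """Return the sorted top-level module names imported by Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports; they refer to the package's own files.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return sorted(names)
```

For a build script containing "import numpy as np", the result includes numpy, which a trace of execve alone would never reveal. Each name would still have to be mapped to the package that provides it.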
All times are GMT