Gentoo Forums
proposal: a better than rpmfind solution for gentoo

Klavs
Guru

Joined: 22 May 2002
Posts: 536
Location: Denmark

Posted: Wed Jul 10, 2002 7:21 am    Post subject: proposal: a better than rpmfind solution for gentoo

The other day I couldn't remember which package a certain file comes from, and for Gentoo I know of no way to find that out except searching rpmfind and Google and seeing what comes up.

My proposal is that we build a repository: one file for each ebuild, which should at least contain every filename that the package can install (which will depend on USE settings). Instead of the developers doing the hard work, this could be added to Portage, so people could contribute to Gentoo just by enabling uploads of file lists (including ebuild versions and USE settings). That way we would quickly build up a central database where people could search, just like rpmfind, and it wouldn't be hard to let people download it like a package so they could search locally.

What do you guys think? Am I the only guy who needs this and doesn't like to rely on rpmfind and Google, as they don't always find what I need? For example, I can't find epm there (except for posts on the Gentoo forums mentioning this particular file), and I'm sure there are others that aren't easy to locate.
_________________
Best regards,

Klavs Klavsen
Denmark

Working with Unix is like wrestling a worthy opponent.
Working with windows is like attacking a small whining child
who is carrying a .38.

klieber
Bodhisattva

Joined: 17 Apr 2002
Posts: 3657
Location: San Francisco, CA

Posted: Wed Jul 10, 2002 12:16 pm

moving to Gentoo Suggestions.

--kurt
_________________
The problem with political jokes is that they get elected

klieber
Bodhisattva

Joined: 17 Apr 2002
Posts: 3657
Location: San Francisco, CA

Posted: Wed Jul 10, 2002 12:22 pm    Post subject: Re: proposal: a better than rpmfind solution for gentoo

Klavs wrote:
What do you guys think? Am I the only guy who needs this and doesn't like to rely on rpmfind and Google, as they don't always find what I need?

I don't think I understand your suggestion. Why do you need to find *any* files? You should be able to get them simply by doing an 'emerge <filename>' (or 'emerge -f <filename>' if you only want to fetch the sources).


Klavs wrote:
f.ex. I can't find epm there


Code:
kurtl@gentoo kurtl $ emerge -s epm
[ Results for search key : epm ]
[ Applications found : 2 ]

*  net-mail/grepmail
      Latest version Available: 4.70
      Latest version Installed: [ Not Installed ]
      Homepage: http://grepmail.sourceforge.net/
      Description: Search normal or compressed mailbox using a regular
      expression or dates.

*  sys-apps/epm
      Latest version Available: 0.7
      Latest version Installed: 0.7
      Homepage: http://www.gentoo.org/~agriffis/epm/
      Description:
      rpm workalike for Gentoo Linux


Looks like the homepage for epm is http://www.gentoo.org/~agriffis/epm/

What am I not understanding about your suggestion?

--kurt

Klavs
Guru

Joined: 22 May 2002
Posts: 536
Location: Denmark

Posted: Wed Jul 10, 2002 2:50 pm

It doesn't work the way you describe for many files, I'm sorry to say.

I wanted exactly the feature you mention, but I tried:

amd root # emerge -s mcopy -p
[ Results for search key : mcopy ]
[ Applications found : 0 ]

even though mtools is even installed..

As you can see, it doesn't work for many of the files in a package.
I often get some script where the guy uses some obscure utility (especially for SGML stuff), and I have sometimes had my hands full trying to figure out which package would get me that file.

This would work if the system I suggest were implemented, as every file from any given package would be known to the system, which, as you can see from my mcopy example, is NOT the case now.

klieber
Bodhisattva

Joined: 17 Apr 2002
Posts: 3657
Location: San Francisco, CA

Posted: Wed Jul 10, 2002 3:28 pm

OK, I understand -- you mean for tools buried inside of other packages. That makes more sense and I agree that it's a real pain in the arse sometimes (trying to figure out what package holds kmail and knode, for example).

You can sort of accomplish what you're trying to do with epm, but that only works for packages that are already installed on your system -- not much help when you're trying to install a specific tool...

I do, however, think that this responsibility falls upon the person who creates the ebuild to fulfill, perhaps by adding a new variable to each ebuild. (such as "PROVIDES" or something similar)

--kurt

Klavs
Guru

Joined: 22 May 2002
Posts: 536
Location: Denmark

Posted: Wed Jul 10, 2002 4:10 pm

I'm glad you like the idea, and see a need for it too :-)

I must, however, disagree that the responsibility should be on the ebuild creator, because that always means some won't do it. Besides, I don't see why they should do extra work when it can be fully automated so that no one has to do anything (that's the best way to solve things, in my opinion).

Also, all the information needed to automate it is already there in the local emerge db, so if we just merged the file lists from everyone, we would know which files an ebuild CAN produce (this depends on USE settings, which is why I wanted to merge the file lists from a lot of people, so we can cover every possible USE setting).

klieber
Bodhisattva

Joined: 17 Apr 2002
Posts: 3657
Location: San Francisco, CA

Posted: Wed Jul 10, 2002 5:23 pm

Klavs wrote:
I must however, disagree that that responsibility should be on the ebuild creator

Huh? They're the ones creating the ebuild -- they should know what that ebuild will install on a person's machine.

Using kdenetwork as an example, the ebuild maintainer would have to fill out one section with a list of apps that kdenetwork provides. So, the line would contain, "kmail, knode.." and so on. Then, an 'emerge -s kmail' would correctly point to kdenetwork as the package.

I don't see this as an overly burdensome task. Building up a fully-automated reporting database that accepts incoming connections from anyone running a gentoo box and is smart enough to filter out dupes, handle packages being moved to new locations gracefully and never make a mistake will take just a bit more effort and create a lot more overhead.

I don't see the value there.

--kurt

delta407
Bodhisattva

Joined: 23 Apr 2002
Posts: 2876
Location: Chicago, IL

Posted: Wed Jul 10, 2002 5:35 pm

It wouldn't be too hard to do, you know. Portage can be told to automatically generate digest files (handy for ebuild maintainers), so why not set up an option that generates indexes of executables? ("find -perm +100" of the sandbox, for instance.) It would not be hard to make Portage search it, either.
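
A minimal sketch of the executable index suggested here, assuming the package has been installed into an image directory that can be walked. The function name `index_executables` and the choice to report only regular files are assumptions made for illustration (the `find -perm +100` command above would also match directories); this is not how Portage itself does anything.

```python
import os
import stat


def index_executables(image_root):
    """Return sorted install paths of regular files with the owner-execute bit set."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(image_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            mode = os.lstat(full).st_mode
            # Regular files only; S_IXUSR is the 0100 owner-execute bit.
            if stat.S_ISREG(mode) and mode & stat.S_IXUSR:
                # Report the path as it would appear once merged to /.
                found.append("/" + os.path.relpath(full, image_root))
    return sorted(found)
```

Calling `index_executables` on the image directory of a build would give the list that could be stored alongside the ebuild.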
_________________
I don't believe in witty sigs.

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Wed Jul 10, 2002 5:37 pm

klieber wrote:
Klavs wrote:
I must however, disagree that that responsibility should be on the ebuild creator

Huh? They're the ones creating the ebuild -- they should know what that ebuild will install on a person's machine.

Using kdenetwork as an example, the ebuild maintainer would have to fill out one section with a list of apps that kdenetwork provides. So, the line would contain, "kmail, knode.." and so on. Then, an 'emerge -s kmail' would correctly point to kdenetwork as the package.

I don't see this as an overly burdensome task.

I think it probably is, when you start talking about every file installed by the package. One situation in which I would find the tool being discussed useful is in trying to find which Gentoo ebuild provides particular header files needed to compile software - in this case the ebuild maintainer would have to make (and maintain) lots of tedious entries for files in /usr/include.

Agreed it would not be trivial to implement this, but I do see the value, and I can think of at least one other way in which it could probably be reused if it did exist - a repository matching various Gentoo users' kernel configuration files with some hardware information gathered from /proc. That has the potential to drastically reduce kernel configuration related posts.
_________________
For every higher wall, there is a taller ladder

delta407
Bodhisattva

Joined: 23 Apr 2002
Posts: 2876
Location: Chicago, IL

Posted: Wed Jul 10, 2002 5:54 pm

It wouldn't be that hard to implement, as long as ebuild maintainers use the sandbox (they had darn well better anyway). Heck, just "cd /var/tmp/portage/mypackage/version/; find > index" and you have all the files that said ebuild provides in index, ready to be copied to the respective /usr/portage/category/package/files/ directory.

Again, not too hard at all. The only problem would be search speed...
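
On the search-speed worry, one possible approach is to read the per-package index files once and invert them into a dictionary keyed by file name, so each query is a single lookup. This is only a sketch: the index format (one absolute path per line), the function names, and the `example-cat/mtools` package name are assumptions, not anything Portage defines.

```python
import posixpath


def build_lookup(indexes):
    """Invert {package: index text} into {basename: [(package, path), ...]}."""
    lookup = {}
    for package, text in indexes.items():
        for line in text.splitlines():
            path = line.strip()
            if not path:
                continue
            lookup.setdefault(posixpath.basename(path), []).append((package, path))
    return lookup


def providers(lookup, filename):
    """Return the (package, path) pairs whose file name matches exactly."""
    return sorted(lookup.get(filename, []))


# Hypothetical index contents, as "find > index" might produce them.
indexes = {
    "example-cat/mtools": "/usr/bin/mcopy\n/usr/bin/mdir\n",
    "sys-apps/fileutils": "/bin/cp\n/bin/ls\n",
}
lookup = build_lookup(indexes)
print(providers(lookup, "mcopy"))
```

Building the dictionary costs one pass over all index files; after that, a lookup does not depend on the number of packages.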

klieber
Bodhisattva

Joined: 17 Apr 2002
Posts: 3657
Location: San Francisco, CA

Posted: Wed Jul 10, 2002 5:57 pm

rac wrote:
I think it probably is, when you start talking about every file installed by the package.

How many packages out there install hundreds of different tools? How big of a deal is it for someone to type out a few different tools, even if a certain package installs 20 different ones? Here -- I'll do it right now:

Code:
kmail, knode, kicker, klipper, cp, ls, cat, less, more, grep, sed, awk, vim, startx, kdm, gdm, xdm, ip, ps, top.


There, that took about 30 seconds, most of which was spent counting to make sure I got 20. Put that in a relational database and you've got a fast, efficient way to query for a particular file.

Now, tell me how you would design a system that does the following:

  • accepts input from hundreds of different clients simultaneously (eventually thousands)
  • weeds out duplicates
  • figures out what version of Gentoo the person is running and...
  • ...uses that to figure out what structure their portage tree should have (since the portage tree structure changes fairly frequently)
  • is able to discern between "real" data and some moron playing a prank.
  • can handle hundreds, if not thousands of simultaneous queries (while still handling hundreds of simultaneous uploads)
  • about 63,000 other "what ifs" that I'm neglecting to list here.


Oh, and then find someone with the hardware to host it all for you.

Sure, the second one is probably doable -- rpmfind works, after all. However, I think it's a poor use of developer time and resources when simply adding the information to the ebuild is so much easier.

--kurt

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Wed Jul 10, 2002 6:22 pm

klieber wrote:
rac wrote:
I think it probably is, when you start talking about every file installed by the package.

How many packages out there install hundreds of different tools?

Well, I'm not sure exactly what you mean by "tool", but I tried to give an explicit example of what I meant by every file, and it included more than just executables. I have 3454 files in /usr/include on the host I am using to type this message, and I would suspect that that's probably below average.

Quote:
Now, tell me how you would design a system that does the following:

I did say that I acknowledged it would not be trivial to implement.

Quote:
  • weeds out duplicates

I'm not sure duplicates are that much of a problem. The goal is to map from an absolute file path to a Gentoo package name. If 4,000 people tell the database that /bin/cp belongs to sys-apps/fileutils, it's no big deal. There's still only one record for /bin/cp. If, on the other hand, there is a conflict over which package provides a file between two users' machines, I would think that would be something that the developers would want to know about, because it would cause problems if both packages were installed.
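
The point about duplicates can be sketched as follows, assuming each upload boils down to (path, package) pairs. The function names and the `example-cat/...` packages are hypothetical; the idea is only that repeated reports collapse into one record, while disagreeing reports surface as conflicts.

```python
def merge_reports(reports):
    """Collapse (path, package) reports into {path: set of packages}."""
    owners = {}
    for path, package in reports:
        # A set makes repeated identical reports a no-op.
        owners.setdefault(path, set()).add(package)
    return owners


def conflicts(owners):
    """Return the paths claimed by more than one package."""
    return {path: sorted(pkgs) for path, pkgs in owners.items() if len(pkgs) > 1}


# 4,000 identical reports plus one disputed path (hypothetical packages).
reports = [("/bin/cp", "sys-apps/fileutils")] * 4000 + [
    ("/usr/bin/foo", "example-cat/a"),
    ("/usr/bin/foo", "example-cat/b"),
]
owners = merge_reports(reports)
print(len(owners["/bin/cp"]), conflicts(owners))
```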

Quote:
  • figures out what version of Gentoo the person is running and...
  • ...uses that to figure out what structure their portage tree should have (since the portage tree structure changes fairly frequently)

I'm sure I'm missing something obvious, but I don't see why the Gentoo version or the portage tree structure matters. All I would want to upload is the fact that /bin/cp is provided by sys-apps/fileutils-4.1.8-r2, and the system might not even care about the version of the ebuild.

As I understand it, the goal is to provide a list of possible packages that might provide a file, so false positives are OK. A human would be looking at the list, and is probably just needing a little hint as to how to proceed. I'm thinking of CDDB, in which you occasionally get multiple entries for the same CD, which is not a crippling thing.

Quote:
  • is able to discern between "real" data and some moron playing a prank.

While it wouldn't work against hardcore malicious people, I think your garden-variety moron prank would be avoided by having a private key embedded in the reporting application, having it digitally sign its uploads, and having the repository reject unsigned or improperly signed submissions.
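
As a rough illustration of rejecting unsigned submissions, here is a sketch that uses an HMAC with a shared secret from the Python standard library. Note that this swaps in a symmetric message authentication code rather than a true public-key digital signature, and the key value and function names are assumptions. Like the idea above, it would not stop anyone who extracts the key from the client.

```python
import hashlib
import hmac

# Hypothetical secret shipped inside the reporting tool.
SHARED_KEY = b"hypothetical-key-embedded-in-the-client"


def sign_upload(payload: bytes, key: bytes = SHARED_KEY) -> str:
    """Client side: compute a hex tag over the upload body."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def accept_upload(payload: bytes, tag: str, key: bytes = SHARED_KEY) -> bool:
    """Server side: accept only uploads whose tag matches the body."""
    expected = sign_upload(payload, key)
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected, tag)
```

An upload with a missing, altered, or made-up tag simply fails `accept_upload` and can be dropped before it reaches the database.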

Quote:
  • can handle hundreds, if not thousands of simultaneous queries (while still handling hundreds of simultaneous uploads)

Separate the hardware handling uploads from that handling queries, and batch-update the database used by the query handler periodically during times of relatively light use.

Quote:
However, I think it's a poor use of developer time and resources when simply adding the information to the ebuild is so much easier.

If we were only talking about core executables, I might be inclined to agree with you. Factor in shared libraries, include files, documentation files and the like, though, and I think it looks completely different.

I would instead suggest that it would be a poor use of developer time and resources asking ebuild maintainers to maintain the file list information for each of these packages; this effort could be spread across the users instead.

delta407
Bodhisattva

Joined: 23 Apr 2002
Posts: 2876
Location: Chicago, IL

Posted: Wed Jul 10, 2002 6:27 pm

Hey, guys! You don't need dedicated hardware, and you don't need new tools. A "what provides this file" system can be made -- easily, I might add, compared to the other options -- by a mechanism similar to how digest files are made. I would guess the code necessary to come up with a file index would be under 50 lines of Python; it's just a matter of coding them and applying it to each ebuild. Again, we already have a digest file for each package, why not an index file too?

This would eliminate the security concerns (as they would be generated by the package maintainer), eliminate the hardware requirements (as searching would be done on the end-user's computer), and make life better overall.

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Wed Jul 10, 2002 6:31 pm

delta407 wrote:
Hey, guys! You don't need dedicated hardware, and you don't need new tools. A "what provides this file" system can be made -- easily, I might add, compared to the other options -- by a mechanism similar to how digest files are made.

For digest files, everybody's using the same source tarball, so my answer is the same as your answer. For "what provides this file", that's not necessarily the case. You and I might have different architectures, or different USE flags, and aren't there cases where that would change which files are installed by a package?

delta407
Bodhisattva

Joined: 23 Apr 2002
Posts: 2876
Location: Chicago, IL

Posted: Wed Jul 10, 2002 6:33 pm

It is rare that USE flags dictate the files produced. USE flags (usually) specify functionality to ./configure, which produces the same set of files (although the files are different, of course). And again, since the package maintainer would be making the index files, if their package for whatever reason had files that depended on the USE flags they could set their USE flags to generate all the possible files, or hack the index file by hand.

rac
Bodhisattva

Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

Posted: Wed Jul 10, 2002 6:42 pm

delta407 wrote:
It is rare that USE flags dictate the files produced.

Rare, true enough. I guess it's a matter of opinion as to how important this is - I can hear poster A asking why their Vietnamese doesn't work in XEmacs any more, and poster B says to check /usr/lib/xemacs/mule-packages/lisp/mule-base/viet-util.el, poster A asks which package they should emerge to get that mule stuff, and it turns out that the answer was that USE mule had to be defined when XEmacs was compiled. A contrived example, I admit.

There is a saying in Japanese that I find applicable to software development that translates roughly to "things that don't happen very often still happen".

Quote:
And again, since the package maintainer would be making the index files, if their package for whatever reason had files that depended on the USE flags they could set their USE flags to generate all the possible files, or hack the index file by hand.

I guess our philosophical difference here is that I was assuming that minimizing the burden this system would place on ebuild maintainers (preferably to zero) would improve the chances of adoption. Your and klieber's responses are causing me to question this assumption. :)

Spark
Tux's lil' helper

Joined: 30 Jun 2002
Posts: 87

Posted: Wed Jul 10, 2002 7:40 pm

To me delta's suggestion sounds a whole lot more reasonable. =) Sure it would miss some very rare occasions (probably) but at least you would have those file indexes for every package _and_ it would be a lot less work.

BTW, what is epm good for? =)

Naan Yaar
Bodhisattva

Joined: 27 Jun 2002
Posts: 1549

Posted: Wed Jul 10, 2002 8:07 pm

We already generate the file index and put it in the CONTENTS file in /var/db... So, in essence, this information is already there at the build site.
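
A sketch of how such CONTENTS data could be searched, assuming the line layout `obj <path> <md5> <mtime>`, `sym <path> -> <target> <mtime>`, and `dir <path>`. The layout is an assumption here and should be checked against a real file under /var/db; the function names and the sample package are hypothetical.

```python
import posixpath


def contents_paths(contents_text):
    """Yield the installed path from each obj/sym line of a CONTENTS file."""
    for line in contents_text.splitlines():
        kind, _, rest = line.partition(" ")
        if kind == "obj":
            # Assumed layout: obj <path> <md5> <mtime>
            yield rest.rsplit(" ", 2)[0]
        elif kind == "sym":
            # Assumed layout: sym <path> -> <target> <mtime>
            yield rest.split(" -> ", 1)[0]
        # dir lines are skipped: only files and symlinks are indexed.


def find_in_contents(contents_by_package, filename):
    """Return (package, path) pairs whose file name equals filename."""
    hits = []
    for package, text in contents_by_package.items():
        for path in contents_paths(text):
            if posixpath.basename(path) == filename:
                hits.append((package, path))
    return sorted(hits)
```

Splitting from the right keeps paths that contain spaces intact, since the checksum and timestamp fields never contain spaces under the assumed layout.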

delta407 wrote:
Hey, guys! You don't need dedicated hardware, and you don't need new tools. A "what provides this file" system can be made -- easily, I might add, compared to the other options -- by a mechanism similar to how digest files are made. I would guess the code necessary to come up with a file index would be under 50 lines of Python; it's just a matter of coding them and applying it to each ebuild. Again, we already have a digest file for each package, why not an index file too?

This would eliminate the security concerns (as they would be generated by the package maintainer), eliminate the hardware requirements (as searching would be done on the end-user's computer), and make life better overall.

iplayfast
l33t

Joined: 08 Jul 2002
Posts: 642
Location: Cambridge On,CA

Posted: Thu Jul 11, 2002 2:05 pm

klieber wrote:
How big of a deal is it for someone to type out a few different tools, even if a certain package installs 20 different ones? Here -- I'll do it right now:

Code:
kmail, knode, kicker, klipper, cp, ls, cat, less, more, grep, sed, awk, vim, startx, kdm, gdm, xdm, ip, ps, top.

<snip>

--kurt


But what if you've missed one that is used by one of the tools listed? E.g. vim: is it built on top of vi? (I don't know, just supposing.)

klieber
Bodhisattva

Joined: 17 Apr 2002
Posts: 3657
Location: San Francisco, CA

Posted: Thu Jul 11, 2002 2:09 pm

iplayfast wrote:
But what if you've missed one that is used by one of the tools listed? E.g. vim: is it built on top of vi? (I don't know, just supposing.)


Then a bug report gets filed on bugs.gentoo.org

--kurt

delta407
Bodhisattva

Joined: 23 Apr 2002
Posts: 2876
Location: Chicago, IL

Posted: Thu Jul 11, 2002 2:55 pm

Naan Yaar wrote:
We already generate the file index and put it in the CONTENTS file in /var/db... So, in essence, this information is already there at the build site.


Yeah, but it's not in the Portage tree, so it works only for packages that are already merged. Plus, it would be nice to get an index of executables separate from the rest of the files. Hmm...

Naan Yaar
Bodhisattva

Joined: 27 Jun 2002
Posts: 1549

Posted: Thu Jul 11, 2002 4:22 pm

Yeah, I do know that :) What I was alluding to is "at least" dumping CONTENTS files from gentoo test build boxes to a web site, even if we did not really want to put this in portage. Of course, if we make this part of portage, developers can populate whatever file is designated for this purpose from a build (using find) or a merge (from CONTENTS).

We could create indices for executables, libraries, doc., etc., based on a master file list generated as described above and the directory into which things are installed as described in the file list.
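
One way to derive those separate indices from a master file list is to classify each path by the directory it installs into. The prefix table below is an assumption chosen for illustration and is deliberately crude (anything under /usr/lib counts as a library, for instance).

```python
# Hypothetical mapping from index name to install-directory prefixes.
CATEGORIES = [
    ("executables", ("/bin/", "/sbin/", "/usr/bin/", "/usr/sbin/")),
    ("libraries", ("/lib/", "/usr/lib/")),
    ("headers", ("/usr/include/",)),
    ("docs", ("/usr/share/doc/", "/usr/share/man/")),
]


def split_index(paths):
    """Split a master file list into per-category lists by path prefix."""
    indices = {name: [] for name, _prefixes in CATEGORIES}
    indices["other"] = []
    for path in paths:
        for name, prefixes in CATEGORIES:
            if path.startswith(prefixes):  # startswith accepts a tuple
                indices[name].append(path)
                break
        else:
            # No prefix matched: keep the path rather than drop it.
            indices["other"].append(path)
    return indices
```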

delta407 wrote:
...Yeah, but it's not in the Portage tree, so it works only for packages that are already merged. Plus, it would be nice to get an index of executables separate from the rest of the files. Hmm...

Klavs
Guru

Joined: 22 May 2002
Posts: 536
Location: Denmark

Posted: Thu Jul 11, 2002 7:21 pm

As the guy who started this discussion, I'm pleased to see so much interest in the idea, which is something I would very much like to see made possible, preferably implemented in Portage. Automation is good :-)

Perhaps we should agree on the different solutions people prefer and make a poll and see the results?

delta407
Bodhisattva

Joined: 23 Apr 2002
Posts: 2876
Location: Chicago, IL

Posted: Thu Jul 11, 2002 7:27 pm

While klieber's suggestion (manually specifying all the executables) is practical, it is limited (only executables, not libraries, include files, etc.) and requires more work -- the latter being the main reason no one will want to go for it. ;)

Since Portage has sandboxing, it is trivial to generate an index of the files a package supplies. Most of the time, they will be the same, regardless of USE settings. But, for the times that they are not, we could use a dependency-like syntax (i.e. "variable? (/path/to/optional/file)" or something like that). That would make it automatic for 90-some percent of the ebuilds, and relatively easy for the minority.
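
To make the dependency-like syntax concrete, here is a sketch of a parser for an index file in which plain lines are always installed and lines of the form `flag? (/path)` are installed only when that USE flag is set. The exact grammar, the function names, and the `/usr/bin/xemacs` path are assumptions; the proposal above only gives the general shape.

```python
import re

# Matches "flag? (/some/path)"; the flag character set is an assumption.
CONDITIONAL = re.compile(r"^(?P<flag>[A-Za-z0-9_+-]+)\?\s*\(\s*(?P<path>.+?)\s*\)$")


def parse_index(text):
    """Return (unconditional paths, {use_flag: [paths]})."""
    always, conditional = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = CONDITIONAL.match(line)
        if match:
            conditional.setdefault(match.group("flag"), []).append(match.group("path"))
        else:
            always.append(line)
    return always, conditional


def files_for(text, use_flags):
    """List the files the package would install with the given USE flags."""
    always, conditional = parse_index(text)
    result = list(always)
    for flag, paths in conditional.items():
        if flag in use_flags:
            result.extend(paths)
    return sorted(result)
```

A search tool could also ignore the flags entirely and match against every path in the file, then report which flag, if any, a hit depends on.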

I don't see much to vote on, other than who files a Portage feature request with Bugzilla. :D

Klavs
Guru

Joined: 22 May 2002
Posts: 536
Location: Denmark

Posted: Thu Jul 11, 2002 7:35 pm

I will be pleased to do that :-)

All times are GMT
Page 1 of 2