Gentoo forum search sucks

346 posts
  • Page 1 of 14
madpenguin8
n00b
Posts: 38
Joined: Wed Jun 19, 2002 6:07 pm
Location: Detroit MI

Post by madpenguin8 » Sun Feb 02, 2003 9:15 pm

Does anyone else think that the Gentoo forum search sucks? It comes up with just about nothing relevant to my search. Really popular posts that have absolutely nothing to do with what I am searching for always seem to turn up close to the top. Writing up a dupe sucks, but what can a guy do if search turns up garbage?
Have you ever stopped to think...............and never start again?

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Post by rac » Sun Feb 02, 2003 9:20 pm

Some tips: make sure your search terms are all 3+ letters long. Join all your terms with 'and' or check "search for all terms", otherwise the default is "or", which is probably not what you want. Use many anded search terms, and widen the search later. Try to choose obscure words if you can: unusual words that appear in an exact error message are usually very good. Things like "bug", "portage", "broken" are bad.
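
To illustrate why the default matters, here is a rough Python sketch. It is not phpBB's actual code: `INDEX`, the post ids, and the `search` helper are all made up for the example. With "or", a post containing any one term matches; with "and", only posts containing every term do.

```python
# Toy inverted index: word -> set of post ids containing it (made-up data).
INDEX = {
    "openbox": {1, 2},
    "slit": {2, 3},
    "emerge": {1, 3, 4, 5},
}

def search(terms, match_all):
    """Return the ids of posts matching the terms.

    match_all=True behaves like "search for all terms" (AND);
    match_all=False behaves like the old default (OR).
    """
    sets = [INDEX.get(t.lower(), set()) for t in terms]
    if not sets:
        return set()
    if match_all:
        return set.intersection(*sets)
    return set.union(*sets)

print(sorted(search(["openbox", "slit"], match_all=False)))  # [1, 2, 3]
print(sorted(search(["openbox", "slit"], match_all=True)))   # [2]
```

Adding more anded terms can only shrink the result set, which is why starting narrow and widening later tends to work better.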

Do you have any recommendations for improving the search engine while still being able to upgrade phpBB in a timely fashion?


EDIT: To consolidate some search threads, I'm including the following quote from another thread. -- pjp
rac wrote: Here's how the phpBB search function works. Each post is split into words. First, some characters are replaced. There are three classes of characters here: those that get replaced by spaces, those that get elided, and those that get left alone. Next, whitespace is used to delineate words. All words of fewer than 3 or more than 20 characters are dropped. Then an entry is made in the dictionary table for every word that is not yet in the dictionary, so that it can be referenced by number. An entry is made in a colossal table for each and every word in each and every post. That's what gets searched against.

To get back to Reformist's two examples, gnome2 is a word. 'gnome 2' is two words, one of which is impossible to match because it is one character long. "1.1.0" is three words, each of which is impossible to match, because it is one character long. One modification that it might be feasible to make would be to change the status of '.'. If it were left alone, version numbers would become searchable. However, words at the end of sentences, followed by periods, would become unsearchable, because a separate entry would be made including the period. If it were elided, the end-of-sentence problem would go away, but then you would have to search for "abiword and 110", and "2.1" would become "21" and fall under the three-character limit.
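
The trade-off around '.' can be tried out with a small Python sketch. This is only an approximation under stated assumptions: `index_words` is a hypothetical helper, and treating every other non-alphanumeric character as a space is a simplification, not phpBB's real character classes. Only the 3 to 20 character limits and the three possible treatments of '.' come from the description above.

```python
import re

MIN_LEN, MAX_LEN = 3, 20  # length limits quoted in the post

def index_words(text, period="space"):
    """Split a post into indexable words.

    period: "space" -> '.' breaks words (what the examples above imply),
            "keep"  -> '.' is left alone,
            "elide" -> '.' is deleted.
    All other punctuation becomes a space (an assumption for this sketch).
    """
    text = text.lower()
    if period == "elide":
        text = text.replace(".", "")
    keep = "." if period == "keep" else ""
    text = re.sub(r"[^a-z0-9" + re.escape(keep) + r"\s]", " ", text)
    return [w for w in text.split() if MIN_LEN <= len(w) <= MAX_LEN]

print(index_words("gnome 2"))                        # ['gnome']
print(index_words("abiword 1.1.0"))                  # ['abiword']
print(index_words("abiword 1.1.0", period="keep"))   # ['abiword', '1.1.0']
print(index_words("portage is broken.", "keep"))     # ['portage', 'broken.']
print(index_words("abiword 1.1.0", period="elide"))  # ['abiword', '110']
print(index_words("kde 2.1", period="elide"))        # ['kde']
```

Each setting fixes one case and breaks another, which matches the dilemma described in the quote.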
For every higher wall, there is a taller ladder

PARENA
Guru
Posts: 349
Joined: Mon Jan 06, 2003 4:49 pm

Post by PARENA » Sun Feb 02, 2003 9:55 pm

Here's an example of what I think is a good search. Let's say you want to know about using the slit in Openbox:

openbox AND slit

(almost) guaranteed to come up with your answer. Unless it's not on the forum of course. :)

idl
Retired Dev
Posts: 1728
Joined: Tue Dec 24, 2002 8:02 pm
Location: Nottingham, UK

Post by idl » Sun Feb 02, 2003 10:42 pm

It would be cool if we could do the old "kde3.1 emerge failure" search. Note the quotes, which make sure it matches the whole string, like on Google :)

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Post by rac » Sun Feb 02, 2003 11:50 pm

port001 wrote: It would be cool if we could do the old "kde3.1 emerge failure" search. Note the quotes, which make sure it matches the whole string, like on Google :)
That, while a nice feature, is completely impossible with the current way the search databases work, because the search match tables have fields for word number and post id only. There is no sense of what words occur next to one another.
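
A toy Python sketch of that limitation, assuming nothing beyond what is said above (rows of word number and post id only). The helper names and the two sample posts are made up, and the word splitting is deliberately naive.

```python
def build_match_rows(posts):
    """Return (dictionary, rows); rows is a set of (word_id, post_id) pairs."""
    dictionary = {}  # word -> word number
    rows = set()
    for post_id, text in posts.items():
        for word in text.lower().split():
            word_id = dictionary.setdefault(word, len(dictionary) + 1)
            rows.add((word_id, post_id))
    return dictionary, rows

posts = {
    1: "kde emerge failure",
    2: "failure kde emerge",
}
dictionary, rows = build_match_rows(posts)

def words_in(post_id):
    """Everything the match table knows about one post: an unordered word set."""
    return {w for w, wid in dictionary.items() if (wid, post_id) in rows}

# Both posts look identical to the index, although only post 1
# contains the words in the order "kde emerge failure".
print(words_in(1) == words_in(2))  # True
```

Supporting phrases would need something extra per row, such as a word position, which the tables described here do not store.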
For every higher wall, there is a taller ladder

Matje
l33t
Posts: 619
Joined: Tue Oct 29, 2002 11:24 pm
Location: Hasselt, Belgium

Post by Matje » Mon Feb 03, 2003 12:35 am

rac wrote:
port001 wrote: It would be cool if we could do the old "kde3.1 emerge failure" search. Note the quotes, which make sure it matches the whole string, like on Google :)
That, while a nice feature, is completely impossible with the current way the search databases work, because the search match tables have fields for word number and post id only. There is no sense of what words occur next to one another.
It _is_ possible, by making PHP look into the actual post texts. It would be a query like:
SELECT post_id FROM phpbb_posts_text WHERE post_text LIKE '%$searchstring%'
But I think this would butcher MySQL, given the number of posts on the forum :-)
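
A self-contained sketch of that idea in Python, with some loudly stated assumptions: an in-memory SQLite database stands in for MySQL, the table has only the two columns named in the query above, and `phrase_search` is a hypothetical helper. The phrase is bound as a parameter with its LIKE wildcards escaped, rather than interpolated into the SQL string.

```python
import sqlite3

def phrase_search(conn, phrase):
    """Return ids of posts whose text contains the phrase as a substring."""
    # Escape LIKE wildcards so the phrase is matched literally.
    escaped = (phrase.replace("\\", "\\\\")
                     .replace("%", "\\%")
                     .replace("_", "\\_"))
    cur = conn.execute(
        "SELECT post_id FROM phpbb_posts_text "
        "WHERE post_text LIKE ? ESCAPE '\\' ORDER BY post_id",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in cur]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE phpbb_posts_text (post_id INTEGER PRIMARY KEY, post_text TEXT)"
)
conn.executemany(
    "INSERT INTO phpbb_posts_text VALUES (?, ?)",
    [(1, "kde3.1 emerge failure on my box"),
     (2, "emerge worked, but kde3.1 had a failure later")],
)
print(phrase_search(conn, "kde3.1 emerge failure"))  # [1]
```

A pattern with a leading % generally cannot use an ordinary index, so the database has to read every post's text, which is the cost being worried about here.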
Life is like a box of chocolates... Before you know it, it's empty...

xlyz
Veteran
Posts: 1470
Joined: Sun Oct 27, 2002 8:04 pm
Location: Italy

Post by xlyz » Mon Feb 03, 2003 2:08 am

rac wrote: Join all your terms with 'and' or check "search for all terms", otherwise the default is "or", which is probably not what you want.
please make "AND" default. "OR" is seldom used

gsfgf
Veteran
Posts: 1266
Joined: Wed May 08, 2002 3:24 pm

Post by gsfgf » Mon Feb 03, 2003 3:01 am

xlyz wrote:
rac wrote: Join all your terms with 'and' or check "search for all terms", otherwise the default is "or", which is probably not what you want.
please make "AND" default. "OR" is seldom used
at least for the quick search.

pilla
Bodhisattva
Posts: 7732
Joined: Wed Aug 07, 2002 8:19 pm
Location: Underworld

Post by pilla » Mon Feb 03, 2003 3:07 am

Moving to Gentoo Forum Feedback.
"I'm just very selective about the reality I choose to accept." -- Calvin

Matje
l33t
Posts: 619
Joined: Tue Oct 29, 2002 11:24 pm
Location: Hasselt, Belgium

Post by Matje » Mon Feb 03, 2003 10:45 am

gsfgf wrote:
xlyz wrote:
rac wrote: Join all your terms with 'and' or check "search for all terms", otherwise the default is "or", which is probably not what you want.
please make "AND" default. "OR" is seldom used
at least for the quick search.
Indeed, I'd like that too, would be a great improvement.
Life is like a box of chocolates... Before you know it, it's empty...

David_Escott
l33t
Posts: 952
Joined: Sun Jan 12, 2003 4:37 pm
Location: Boston, MA

Post by David_Escott » Mon Feb 03, 2003 1:37 pm

I have the compelling urge to say the following:

THIS IS A DUPLICATE THREAD.
PLEASE SEARCH

http://forums.gentoo.org/viewtopic.php?t=30782
was the closest I could get. Searching was a pain because, well, it sucks (sorry, that was a duplicate thought).
But I do remember seeing one thread where rac may have explained some of the difficulties in trying to make search work a little better, if only I could find it again :(

Ahhh found it
http://forums.gentoo.org/viewtopic.php?p=167526#167526 down the page rac explains some of the difficulties

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Post by rac » Mon Feb 03, 2003 10:13 pm

Thanks for those links, David_Escott. As a journey of a thousand miles begins with a single step, I have changed the default setting of the checkbox in the search screen to "Search for all terms". Hopefully this will improve things in some small measure. Note that this change only affects the purple "gentoo" theme - people still using subSilver will not be affected.
For every higher wall, there is a taller ladder

xlyz
Veteran
Posts: 1470
Joined: Sun Oct 27, 2002 8:04 pm
Location: Italy

Post by xlyz » Mon Feb 03, 2003 10:51 pm

quick search is still "search for any words" as default

are you going to change it as well?

TIA

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Post by rac » Mon Feb 03, 2003 11:30 pm

Apparently phpBB was caching the template file, so my changes to quicksearch weren't taking hold. It should be fixed now.
For every higher wall, there is a taller ladder

gsfgf
Veteran
Posts: 1266
Joined: Wed May 08, 2002 3:24 pm

Post by gsfgf » Mon Feb 03, 2003 11:40 pm

rac, I noticed in your other post that version numbers aren't searchable. If you convert 2.2.2 to 222 and make search strip periods as well, so that a search for kde 3.1 is treated as kde 31, that would solve that issue. That may be harder than it looks, though.

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Post by rac » Mon Feb 03, 2003 11:48 pm

gsfgf wrote: If you convert 2.2.2 to 222 and make search strip periods as well, so that a search for kde 3.1 is treated as kde 31, that would solve that issue. That may be harder than it looks, though.
It is harder than it looks, for a couple of reasons. Either a period causes a word break or it doesn't. Now what would be best is if it caused a word break only if it wasn't a version number, but that could be a challenging regex. Maybe we could steal it from Portage. I think if we're going to go this far, we might as well get it right and have "version numbers" go into the index.

I just had an idea on how this might be implemented. Details later.
For every higher wall, there is a taller ladder

phong
Bodhisattva
Posts: 778
Joined: Tue Jul 16, 2002 6:51 pm
Location: Michigan - 15 & Ryan

Post by phong » Wed Feb 05, 2003 4:35 am

Not that tough of a regex... For matching (in a generic way) words of 3+ characters and version numbers, I might use the following for matching words to put into the index (offhand, could be more robust):
(\b\w{3,}\b|\b\d(\.\d){1,2}\b)
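
For anyone who wants to try it, here is the same pattern run through Python's `re` module. The `indexable` helper and the sample strings are made up for the test; the regex itself is copied unchanged from above.

```python
import re

# The pattern from the post, unchanged.
WORD_OR_VERSION = re.compile(r"(\b\w{3,}\b|\b\d(\.\d){1,2}\b)")

def indexable(text):
    """Return the substrings the pattern would put into the index."""
    # finditer + group(0) is used because findall would return one tuple
    # per match when the pattern contains capture groups.
    return [m.group(0) for m in WORD_OR_VERSION.finditer(text)]

print(indexable("emerge kde 3.1 failed on gnome 2"))
# ['emerge', 'kde', '3.1', 'failed', 'gnome']
print(indexable("abiword 1.1.0"))
# ['abiword', '1.1.0']
```

Since `\d` matches exactly one digit, the version branch only covers single-digit components such as 3.1 or 1.1.0, so a more robust version would need to allow longer numbers.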

What's the current regexp?
"An empty head is not really empty; it is stuffed with rubbish. Hence the difficulty of forcing anything into an empty head."
-- Eric Hoffer

Lion
Apprentice
Posts: 207
Joined: Sun Jun 23, 2002 3:17 pm

Search still broken

Post by Lion » Sat Feb 08, 2003 10:07 am

I think there is still something basically wrong with search.
I always try to include as many relevant search terms as I can in my query, but I still do not get posts that I know to be available.

Simple example: Search for the word 'world'.
'No topics or posts met your search criteria'.
I know this is not true, because the world file is mentioned in many posts.
Search for 'world AND file'.
Thousands of posts, many of which do NOT contain the word 'world'.

So, my question is: what am I doing wrong?

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Re: Search still broken

Post by rac » Sat Feb 08, 2003 7:48 pm

Lion wrote:So, my question is: what am I doing wrong?
Being unlucky. To help keep the size of the search tables down, there is a "stopword list" in phpBB's search function. Words on the stopword list are not indexed because they are too common. Unfortunately for your example, world is on the stopword list, so nothing shows up, and then when you search for "world and file", you are really only searching for 'file'.
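
A small Python sketch of that behaviour. The index contents, the post ids, and every stopword other than "world" are assumptions made up for the example, and `search_all` is a hypothetical helper, not phpBB's function.

```python
# Hypothetical subset of a stopword list; only "world" is confirmed above.
STOPWORDS = {"world", "the", "and"}

# Stopwords are never indexed, so "world" has no entry at all (made-up data).
INDEX = {"file": {1, 2, 3}}

def search_all(terms):
    """AND-search that silently drops stopwords before looking anything up."""
    usable = [t.lower() for t in terms if t.lower() not in STOPWORDS]
    if not usable:
        return set()  # nothing left to search for
    return set.intersection(*(INDEX.get(t, set()) for t in usable))

print(sorted(search_all(["world"])))          # []
print(sorted(search_all(["world", "file"])))  # [1, 2, 3]
print(sorted(search_all(["file"])))           # [1, 2, 3]
```

The last two calls return the same posts, which is the "you are really only searching for 'file'" effect.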
For every higher wall, there is a taller ladder

pjp
Administrator
Posts: 20668
Joined: Tue Apr 16, 2002 10:35 pm

Post by pjp » Sat Feb 08, 2003 8:02 pm

Maybe emerge, portage, gentoo, gnome and a lot of others should be added to the list :D
Quis separabit? Quo animo?

xlyz
Veteran
Posts: 1470
Joined: Sun Oct 27, 2002 8:04 pm
Location: Italy

Re: Search still broken

Post by xlyz » Sat Feb 08, 2003 8:59 pm

rac wrote:
Lion wrote:So, my question is: what am I doing wrong?
Being unlucky. To help keep the size of the search tables down, there is a "stopword list" in phpBB's search function.
what are the words included in the list?

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Re: Search still broken

Post by rac » Sun Feb 09, 2003 12:46 am

xlyz wrote:what are the words included in the list?
http://forums.gentoo.org/language/lang_ ... pwords.txt
For every higher wall, there is a taller ladder

dufeu
l33t
Posts: 927
Joined: Fri Aug 30, 2002 2:59 pm
Location: US-FL-EST

Re: Search still broken

Post by dufeu » Thu Feb 13, 2003 2:45 am

rac wrote:
xlyz wrote:what are the words included in the list?
http://forums.gentoo.org/language/lang_ ... pwords.txt
I assume that gentoo bugzilla has a similar stopword list? It certainly would explain some of my difficulties in using search there.

Do you have an inkling (and could you share the location) of the search stopword list there?
People whom think M$ is mediocre, don't know the half of it.

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Re: Search still broken

Post by rac » Thu Feb 13, 2003 6:04 pm

dufeu wrote:I assume that gentoo bugzilla has a similar stopword list?
Bugzilla's completely different software. I don't know off the top of my head whether there's a stopword list. If I get some time I may look into it further.
For every higher wall, there is a taller ladder

edoloughlin
n00b
Posts: 12
Joined: Wed Oct 02, 2002 9:13 am
Location: Ireland

Re: Search still broken

Post by edoloughlin » Wed Feb 19, 2003 10:16 am

rac wrote:
Lion wrote:So, my question is: what am I doing wrong?
Being unlucky. To help keep the size of the search tables down, there is a "stopword list" in phpBB's search function. Words on the stopword list are not indexed because they are too common. Unfortunately for your example, world is on the stopword list, so nothing shows up, and then when you search for "world and file", you are really only searching for 'file'.
Perhaps a lesson could be learned from Google. Stopwords are identified if they are included in a search, viz:
"the" is a very common word and was not included in your search
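
Something along these lines could be sketched in Python as follows. Both helpers and the one-word stopword set in the usage lines are hypothetical; the message wording is borrowed from the Google notice quoted above.

```python
def split_query(terms, stopwords):
    """Separate the terms that will be searched from the ignored stopwords."""
    used, ignored = [], []
    for term in terms:
        (ignored if term.lower() in stopwords else used).append(term)
    return used, ignored

def notices(ignored):
    """Build one user-facing message per ignored term."""
    return [
        f'"{word}" is a very common word and was not included in your search'
        for word in ignored
    ]

used, ignored = split_query(["world", "file"], {"world"})
print(used)  # ['file']
for line in notices(ignored):
    print(line)
# "world" is a very common word and was not included in your search
```

With a notice like this, a search for 'world' alone would explain itself instead of just reporting that no posts matched.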