new search stopwords list

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Post by rac » Wed Sep 15, 2004 10:14 pm

We've analyzed the most commonly occurring words on the forums, and made some additions to the stopword list. Attempting to search using any of these words won't return any posts, and if you combine a stopword with other legitimate terms, the stopword just gets ignored.
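
The behaviour described above can be sketched in a few lines. This is only an illustration, not phpBB's actual search code: the `STOPWORDS` set is a small excerpt of the list, `usable_terms` is a hypothetical helper name, and case-insensitive matching is an assumption.

```python
# Small excerpt of the stopword list; the real list is much longer.
STOPWORDS = {"you", "have", "new", "in", "the", "compile"}


def usable_terms(query):
    """Return the query terms that are not stopwords (case-insensitive)."""
    return [term for term in query.lower().split() if term not in STOPWORDS]


# A stopword mixed with legitimate terms is simply dropped.
print(usable_terms("You have new mail in"))

# A query made only of stopwords leaves nothing to search for.
print(usable_terms("the new"))
```

With this sketch, a query consisting only of stopwords yields an empty term list, which is why such a search returns no posts.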

Here's the current list, including both upstream phpBB's entries and ours:

AFAIK
I
IIRC
Ive
LOL
ROTF
ROTFLMAO
YMMV
a
aber
able
about
above
access
actually
add
after
again
ago
all
almost
along
alot
already
also
always
am
amp
an
and
and
another
answer
any
anybody
anybodys
anyone
anything
anyway
anywhere
are
arent
around
as
ask
askd
at
auch
auf
available
back
bad
be
because
been
before
being
believe
best
better
between
big
bit
both
box
btw
bug
build
but
but
by
can
cannot
cant
card
case
change
che
check
code
come
command
compile
compiled
compiling
computer
con
configuration
correct
could
couldnt
course
create
das
day
days
days
default
den
der
desktop
did
didnt
die
different
do
does
doesnt
doing
done
dont
down
drive
each
edit
either
else
emerged
end
enough
errors
etc
even
ever
every
everybody
everybodys
everyone
everything
exactly
example
failed
far
few
file
files
find
fine
first
fix
fixed
following
for
for
forum
forums
found
from
function
gentoo
get
getting
give
go
going
gone
good
got
gotten
great
guess
had
hard
hardware
has
have
have
havent
having
help
her
here
hers
him
his
home
hope
how
however
hows
href
ich
idea
ideas
if
ill
in
info
ini
install
installation
installed
installing
instead
into
is
isnt
issue
ist
it
its
ive
just
keep
know
large
last
latest
least
less
let
lib
like
liked
line
link
linux
list
little
load
local
log
lol
long
look
looked
looking
looking
looks
lot
machine
made
mal
man
many
may
maybe
me
mean
message
might
mit
mode
more
most
much
must
mustnt
my
name
near
need
net
network
never
new
news
next
nice
nicht
no
non
none
not
nothing
now
of
off
often
old
on
once
one
only
oops
open
option
options
or
org
other
our
ours
out
output
over
own
package
packages
page
part
pas
people
per
play
please
point
possible
post
pretty
probably
problem
problems
program
put
que
question
questioned
questions
quite
quot
quote
rather
read
really
reason
recent
remember
right
run
said
same
saw
say
says
screen
script
see
seem
seems
sees
server
set
setting
settings
setup
she
should
since
sites
small
so
software
solution
some
someone
something
sometime
somewhere
soon
sorry
source
start
started
still
stuff
such
support
sure
take
tell
than
thank
thanks
that
thatd
thats
the
the
their
theirs
them
then
there
theres
these
they
theyd
theyll
theyre
thing
things
think
this
this
those
though
thought
thread
through
thus
time
times
to
too
tried
true
try
trying
two
type
und
under
until
untrue
up
update
upon
use
used
user
users
using
usr
version
very
via
want
was
way
we
well
went
were
werent
what
whats
when
where
which
while
who
whom
whose
why
wide
will
wink
with
with
within
without
wont
work
worked
working
works
world
worse
worst
would
wrong
wrote
www
yes
yet
you
you
youd
youll
your
youre
yours
For every higher wall, there is a taller ladder

klieber
Bodhisattva
Posts: 3657
Joined: Wed Apr 17, 2002 4:48 pm
Location: San Francisco, CA

Post by klieber » Thu Sep 16, 2004 12:11 am

To follow up on rac's post, the reason we did this was to reduce the size of our search database in mysql. It was overwhelming the database server and causing the slowdowns that people have been experiencing recently.

--kurt
The problem with political jokes is that they get elected

kamagurka
Veteran
Posts: 1026
Joined: Sun Jan 25, 2004 1:55 am
Location: /germany/munich

Post by kamagurka » Mon Sep 20, 2004 6:05 pm

would it be possible to have the search throw an informative error when searching for stopwords instead of just saying "no posts found"?
If you loved me, you'd all kill yourselves today.
--Spider Jerusalem, the Word

rac
Bodhisattva
Posts: 6553
Joined: Thu May 30, 2002 6:19 am
Location: Japanifornia

Post by rac » Mon Sep 20, 2004 6:26 pm

It might be possible to change the message to something like "none of your search terms were usable" in the case where you enter only stopwords. Telling you that some of your terms were used, but not others, would be considerably harder.
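
A rough sketch of that distinction, for illustration only: `search_message` is a hypothetical helper, the stopword set is a tiny stand-in for the real list, and the message wording is made up apart from the "none of your search terms were usable" idea above.

```python
# Tiny stand-in for the real stopword list.
STOPWORDS = {"you", "have", "new", "in"}


def search_message(query):
    """Pick a user-facing message based on which terms survive filtering."""
    terms = query.lower().split()
    if not terms:
        return "Please enter a search term."
    kept = [term for term in terms if term not in STOPWORDS]
    if not kept:
        # Every term was a stopword: the easy case to report.
        return "None of your search terms were usable."
    return "Searching for: " + " ".join(kept)


print(search_message("you have"))
print(search_message("You have new mail in"))
```

The all-stopwords case only needs a check on the filtered list. Reporting which individual terms were dropped is the part described above as considerably harder in the real search code.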
For every higher wall, there is a taller ladder

Gentree
Watchman
Posts: 5350
Joined: Tue Jul 01, 2003 12:51 am
Location: France, Old Europe

Post by Gentree » Sat Oct 23, 2004 7:49 pm

klieber wrote:To follow up on rac's post, the reason we did this was to reduce the size of our search database in mysql. It was overwhelming the database server and causing the slowdowns that people have been experiencing recently.

--kurt
You may like to consider how much the lack of an effective search tool is burdening the database.

People can't find what's there, so they make a new post and there's a new thread of 10 or 20 posts.

Before too long this will become unmanageable and the forum will break.

Without the forum Gentoo would be of limited use.

I have made concrete suggestions in other posts today.

HTH 8)
Linux, because I'd rather own a free OS than steal one that's not worth paying for.
Gentoo because I'm a masochist
AthlonXP-M on A7N8X. Portage ~x86

Deathwing00
Bodhisattva
Posts: 4087
Joined: Fri Jun 13, 2003 9:07 pm
Location: Berlin, Germany

Post by Deathwing00 » Sun Oct 24, 2004 1:27 am

I made this one sticky... I think it's important to know what words are filtered.

c45207
n00b
Posts: 70
Joined: Mon Mar 08, 2004 3:43 am

Post by c45207 » Thu Jan 27, 2005 3:25 am

Is there any way to override this? For example, today I wanted to find "You have new mail in". However, only mail is a searchable word, so I got lots of useless posts.

ian!
Bodhisattva
Posts: 3829
Joined: Tue Feb 25, 2003 9:52 am
Location: Essen, Germany

Post by ian! » Thu Jan 27, 2005 7:06 am

c45207 wrote:Is there any way to override this?
No.
"To have a successful open source project, you need to be at least somewhat successful at getting along with people." -- Daniel Robbins

Wicked Wesley
n00b
Posts: 70
Joined: Thu May 20, 2004 6:32 pm
Location: Here

Post by Wicked Wesley » Fri Jan 28, 2005 4:50 pm

Just to let you know, the word but is in there twice!

Have a nice day!
The Jester!
Linux user 357122!

knefas
l33t
Posts: 828
Joined: Sun Dec 21, 2003 1:20 am

Post by knefas » Fri Jan 28, 2005 5:25 pm

Ohh...also two days, have and this :)

masseya
Bodhisattva
Posts: 2602
Joined: Wed Apr 17, 2002 3:56 pm
Location: Baltimore, MD

Post by masseya » Fri Jan 28, 2005 10:49 pm

Those are particularly insidious words that absolutely have to be stopped, so we put a second entry in the stopwords list as a way to add injury to insult for the many weeks of futile searching those words have caused.
if i never try anything, i never learn anything..
if i never take a risk, i stay where i am..

Anior
Guru
Posts: 317
Joined: Thu Apr 17, 2003 10:52 pm
Location: European Union (Stockholm / Sweden)

Post by Anior » Sat Jan 29, 2005 2:56 am

c45207 wrote:Is there any way to override this? For example, today I wanted to find "You have new mail in". However, only mail is a searchable word, so I got lots of useless posts.
You can use Google to search the forums, although you'll only get hits from posts that have been indexed.

http://www.google.com/search?hl=en&q=si ... ew+mail%22

SubAtomic
Apprentice
Posts: 255
Joined: Sat Dec 20, 2003 6:33 am
Location: Hobart, TAS, Australia

Post by SubAtomic » Thu Feb 10, 2005 3:24 am

What about RTFM and rtfm, IMHO and imho?

Would a "Suggest words to add to the stopwords list" thread (possibly in the Feedback section) be of use? I'm thinking of something similar to the report spammers thread.
"The real romance is out ahead and yet to come. The computer revolution hasn't started yet. Don't be misled by the enormous flow of money into bad defacto standards for unsophisticated buyers using poor adaptations of incomplete ideas." -- Alan Kay

cokey
Advocate
Posts: 3355
Joined: Fri Apr 23, 2004 12:30 am

Post by cokey » Thu Mar 24, 2005 12:07 pm

I think "compile" and "error(s)" should be taken out, after all this is gentoo not SuSE
https://otw20.com/ OTW20 The new place for off the wall chat

masseya
Bodhisattva
Posts: 2602
Joined: Wed Apr 17, 2002 3:56 pm
Location: Baltimore, MD

Post by masseya » Thu Mar 24, 2005 11:02 pm

cokehabit wrote:I think "compile" and "error(s)" should be taken out, after all this is gentoo not SuSE
The reason these words are on the list is that they are too commonly appearing to actually be of use in identifying a particular thread. There are so many posts with the words 'compile' or 'error' that it's not a useful descriptor. If I were trying to describe myself to you so you could pick me out of a crowd at an amusement park I would want to avoid a description such as "medium height with blue jeans, sneakers and a tshirt" because it wouldn't really tell you anything that would set me apart from virtually everyone else. This is essentially the kind of description you get when searching for the words 'compile' and 'error'.
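
To make the "too common to be a useful descriptor" point concrete, here is a minimal sketch that counts how many posts each word appears in and flags the words found in more than a given fraction of them. The sample posts, the 0.5 cutoff and the name `common_words` are all made up for illustration; nothing here describes how the actual analysis of the forums was done.

```python
from collections import Counter


def common_words(posts, max_fraction=0.5):
    """Return words that occur in more than max_fraction of the posts."""
    doc_freq = Counter()
    for post in posts:
        # Count each word once per post, however often it repeats there.
        doc_freq.update(set(post.lower().split()))
    limit = max_fraction * len(posts)
    return sorted(word for word, count in doc_freq.items() if count > limit)


sample_posts = [
    "compile error in kernel",
    "compile error with alsa",
    "error when i compile xorg",
    "wireless card not detected",
]
print(common_words(sample_posts))
```

A word that shows up in most posts narrows a search very little, while still costing index space for every post it appears in.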
if i never try anything, i never learn anything..
if i never take a risk, i stay where i am..

kallamej
Administrator
Posts: 4993
Joined: Fri Jun 27, 2003 10:05 am
Location: Gothenburg, Sweden

Post by kallamej » Thu Mar 24, 2005 11:15 pm

Heh, error is not in the list, actually.
Please read our FAQ Forum, it answers many of your questions.
irc: #gentoo-forums on irc.libera.chat

cokey
Advocate
Posts: 3355
Joined: Fri Apr 23, 2004 12:30 am

Post by cokey » Fri Mar 25, 2005 7:52 am

kallamej wrote:Heh, error is not in the list, actually.
errors is, so I put it in bracket(s)
https://otw20.com/ OTW20 The new place for off the wall chat

masseya
Bodhisattva
Posts: 2602
Joined: Wed Apr 17, 2002 3:56 pm
Location: Baltimore, MD

Post by masseya » Fri Mar 25, 2005 6:56 pm

kallamej wrote:Heh, error is not in the list, actually.
lol.. We should, like, add that and stuff.
if i never try anything, i never learn anything..
if i never take a risk, i stay where i am..

cokey
Advocate
Posts: 3355
Joined: Fri Apr 23, 2004 12:30 am

Post by cokey » Fri Mar 25, 2005 7:05 pm

Is there any way to make the Gentoo forums searchable through Google like Wikipedia is? Perhaps someone could speak to them? That would sort out the search database while offering Google free advertising every time someone searches through Gentoo.
https://otw20.com/ OTW20 The new place for off the wall chat

kallamej
Administrator
Posts: 4993
Joined: Fri Jun 27, 2003 10:05 am
Location: Gothenburg, Sweden

Post by kallamej » Fri Mar 25, 2005 7:54 pm

Yes, the forums are google searchable, but there are only about 30K pages indexed. It's increasing quite nicely since the urls got html-ised, though.
Please read our FAQ Forum, it answers many of your questions.
irc: #gentoo-forums on irc.libera.chat

Satori80
Tux's lil' helper
Posts: 137
Joined: Tue Feb 24, 2004 6:33 pm

Post by Satori80 » Sat Apr 09, 2005 11:19 am

Why don't you guys try to get a consensus? I for one would rather have a slow useful search database than a quick irrelevant one.

curtis119
Bodhisattva
Posts: 2160
Joined: Mon Mar 10, 2003 4:41 pm
Location: Toledo, Ohio,USA, North America, Earth, SOL System, Milky Way, The Universe, The Cosmos, and Beyond.

Post by curtis119 » Sat Apr 09, 2005 11:30 am

Satori80 wrote:Why don't you guys try to get a consensus? I for one would rather have a slow useful search database than a quick irrelevant one.
The stop words list is attempting to do both: a quick and relevant search. It's gotten so much better since rac and ian! started actively doing this. I search constantly and have noticed a significant difference in the quality of results.
Gentoo: it's like wiping your ass with silk.

Satori80
Tux's lil' helper
Posts: 137
Joined: Tue Feb 24, 2004 6:33 pm

Post by Satori80 » Sat Apr 09, 2005 11:34 am

masseya wrote:
cokehabit wrote:I think "compile" and "error(s)" should be taken out, after all this is gentoo not SuSE
The reason these words are on the list is that they are too commonly appearing to actually be of use in identifying a particular thread. There are so many posts with the words 'compile' or 'error' that it's not a useful descriptor.
It isn't the words in and of themselves that make them useful or not. It's the use of the words in combination with other specific words. For instance, those generated in an error message. If the search finds all the terms in the error message, you can quickly find the subject of your concern. Without the right words at your disposal, you'll have to fish around through irrelevant topics to try and find what you need to get your system back on its feet. I've found myself in this second situation more often than usual the past few days – more than once without resolution to my issue. Now I know why. It isn't because the issue isn't in the forums, it's because it can't be found due to a flaky search. And frankly, I'm pissed about it.

There is a reason error messages are generated in the first place. If you can't make the forums able to find specific input then why bother devoting the resources to keep them online? I always used the forums as a troubleshooting tool in the past. Apparently, I can no longer do that. Too bad for me, huh?

Satori80
Tux's lil' helper
Posts: 137
Joined: Tue Feb 24, 2004 6:33 pm

Post by Satori80 » Sat Apr 09, 2005 11:54 am

Look, I'm sorry if that last post came off as crass. I wasn't trying to insult anybody, and I didn't mean to direct my frustration at any one person in particular.

But the sentiment is valid. I mean, look at that list. "Man" is in the list? If I have an issue with the "man" program I can't directly look for a resolution to my issue in these forums? Come on, guys, give us a fighting chance.

cokey
Advocate
Posts: 3355
Joined: Fri Apr 23, 2004 12:30 am

Post by cokey » Sat Apr 09, 2005 12:11 pm

curtis119 wrote:
Satori80 wrote:Why don't you guys try to get a consensus? I for one would rather have a slow useful search database than a quick irrelevant one.
The stop words list is attempting to do both: a quick and relevant search. It's gotten so much better since rac and ian! started actively doing this. I search constantly and have noticed a significant difference in the quality of results.
I've noticed the opposite: I continually miss threads or have no threads come up at all where I would expect at least a few. It is VERY infuriating if you cannot get ONE SINGLE THREAD to come up. It just makes it look broken.
https://otw20.com/ OTW20 The new place for off the wall chat