Gentoo Forums
Gentoo Forums
Gentoo Forums
Quick Search: in
phpBB hacker feedback wanted
View unanswered posts
View posts from last 24 hours

 
Reply to topic    Gentoo Forums Forum Index Gentoo Forums Feedback
View previous topic :: View next topic  
Author Message
rac
Bodhisattva
Bodhisattva


Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

PostPosted: Mon Nov 25, 2002 11:48 pm    Post subject: phpBB hacker feedback wanted Reply with quote

In studying some of the performance bottlenecks that the forums are experiencing, I have found one area that looks outrageously wasteful to me, and I have witnessed it for all intents and purposes making the forums unusable when it was running, as it grabbed a lock on a key search table, causing other clients to pile up behind it.

I speak of the code in remove_search_post (includes/functions_search.php approx. line 353) that removes from the search wordlist all words from a given post that is being deleted, if that word was unique. This looks like it would be a massive query, especially for large posts, as my understanding is it searches for matches all across all posts for every single word in the post being deleted.

First, can somebody familiar with the phpBB code verify that my understanding of the purpose and scope of this query is accurate? The search_wordmatch table for the current state of the forums is 5 million+ records.

Second, if we were to disable this work, would there be any adverse effects other than potential waste of space in the wordlist table by words that were only used in posts that have since been deleted? If so, this seems to me an excellent tradeoff. That table has <200K records, and I don't mind wasting some spots in there at all. Besides, a monthly cron job could cross-reference the wordmatch table with the posts table to find these holes and remove them, if desired.
_________________
For every higher wall, there is a taller ladder
Back to top
View user's profile Send private message
carambola5
Apprentice
Apprentice


Joined: 10 Jul 2002
Posts: 214

PostPosted: Tue Nov 26, 2002 5:45 am    Post subject: Reply with quote

Judging from my quick view over at www.phpbb.com/bugs It doesn't look like you filled a bug report. Might wanna do that. I'm willing to bet that the phpBB folks are quite familiar with their own code :D
Back to top
View user's profile Send private message
rac
Bodhisattva
Bodhisattva


Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

PostPosted: Tue Nov 26, 2002 5:53 am    Post subject: Reply with quote

I thought it a bit presumptuous to call it a bug, because I'm not sure that I understand exactly what it's trying to do, and if I do understand it, then it's doing what it should be doing. It's just that it seems like an awfully costly way of doing it.
_________________
For every higher wall, there is a taller ladder
Back to top
View user's profile Send private message
klieber
Bodhisattva
Bodhisattva


Joined: 17 Apr 2002
Posts: 3657
Location: San Francisco, CA

PostPosted: Tue Nov 26, 2002 12:53 pm    Post subject: Reply with quote

I rummaged around over at phpBB and came up with this thread which seems to validate what you're suggesting.

It's a PostgreSQL thread, but I imagine the same basic concept would apply to MySQL as well.

--kurt
_________________
The problem with political jokes is that they get elected
Back to top
View user's profile Send private message
rac
Bodhisattva
Bodhisattva


Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

PostPosted: Tue Nov 26, 2002 7:11 pm    Post subject: Reply with quote

Thanks klieber, that looks exactly like the issue I'm talking about. The situation in MySQL is slightly different (and actually slightly worse) because MySQL cannot support subselects. So for MySQL, one SELECT gets a list of all words in the post being deleted, a second SELECT gets all of them that are in that list across the entire search database (this is the real killer, I think) and then a DELETE wipes out the ones that were only present in the deleted post. I will look into patching our board to defang this hotspot.

EDIT: I have tested a patch that comments out the section in question on another server running phpBB, and it acts like I expected. Unused and unreachable words remain in the wordlist, taking up space. However, they will never hit anything. There should be no user-visible repercussions of applying this patch. Deleted threads will not begin being hit by search results, which was my one minor concern.

EDIT2: This patch has now been applied to forums.gentoo.org - if anybody notices any strange new behavior that they think might be related, please post to this thread.
_________________
For every higher wall, there is a taller ladder
Back to top
View user's profile Send private message
lars_msh
n00b
n00b


Joined: 25 Aug 2002
Posts: 50
Location: Airstrip One

PostPosted: Wed Nov 27, 2002 11:04 pm    Post subject: Reply with quote

Interesting stuff! So, what did you do? It looks to me like a case of removing/commenting the lines of code in functions_search.php so the case statement for 'mysql' and 'mysql4' does nothing.

Or is there more to it than that?

I have a MySQL database and could live with a few dead words in there...
Back to top
View user's profile Send private message
rac
Bodhisattva
Bodhisattva


Joined: 30 May 2002
Posts: 6553
Location: Japanifornia

PostPosted: Wed Nov 27, 2002 11:08 pm    Post subject: Reply with quote

I commented out the entire switch( SQL_LAYER ) {} in remove_search_post.
_________________
For every higher wall, there is a taller ladder
Back to top
View user's profile Send private message
lars_msh
n00b
n00b


Joined: 25 Aug 2002
Posts: 50
Location: Airstrip One

PostPosted: Wed Nov 27, 2002 11:21 pm    Post subject: Reply with quote

Yeah that ought to do it, thanks. :-)

Footnote:

In a bizarre coincidence, the very same POP mail poll brought in my reply notification for this topic... and my first ever phpBB bug update for a bug I reported months ago. I wondered what was going on!
Back to top
View user's profile Send private message
Display posts from previous:   
Reply to topic    Gentoo Forums Forum Index Gentoo Forums Feedback All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum