Gentoo Forums
F- forums performance

Poll: block crawlers
yes: 0% [ 0 ]
no: 37% [ 3 ]
selectively from non essential forums: 62% [ 5 ]
Total Votes : 8

666threesixes666
Veteran


Joined: 31 May 2011
Posts: 1248
Location: 42.68n 85.41w

Posted: Wed Mar 05, 2014 10:23 am    Post subject: F- forums performance

desultory says that crawlers are clogging up the pipe.

OTW, Gentoo Chat, Dustbin, Duplicate Threads, and Forums Feedback should be blocked from crawlers.

OTW is a circus, Google shouldn't index users giving feedback, Chat doesn't have serious technical discussions going on, and Duplicate Threads and Dustbin are obvious. GLSA should not be crawled either; it's duplicated elsewhere.

Is blocking crawling altogether the greater of two evils? I'd say Google DoS-attacking the forum and taking all functionality away is the greater of two evils.
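
A minimal sketch of what selective blocking could look like, assuming each subforum lives under its own URL path prefix. The prefixes and the host name below are hypothetical; the thread does not give the real ones. It builds a robots.txt and checks it with Python's standard urllib.robotparser. Keep in mind that robots.txt is advisory, so only well-behaved crawlers will honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical path prefixes for the forums named above (assumption:
# the real forum URLs may not be organized this way at all).
BLOCKED_PREFIXES = [
    "/forum/off-the-wall/",
    "/forum/gentoo-chat/",
    "/forum/dustbin/",
    "/forum/duplicate-threads/",
    "/forum/forums-feedback/",
]


def build_robots_txt(prefixes):
    """Return robots.txt text that disallows the given prefixes for all crawlers."""
    lines = ["User-agent: *"]
    lines.extend(f"Disallow: {prefix}" for prefix in prefixes)
    return "\n".join(lines) + "\n"


def crawler_may_fetch(robots_txt, url, user_agent="ExampleBot"):
    """Check a URL against robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_robots_txt(BLOCKED_PREFIXES)
    print(rules)
    for url in (
        "https://forums.example.org/forum/off-the-wall/topic-1",
        "https://forums.example.org/forum/installing-gentoo/topic-2",
    ):
        print(url, "->", crawler_may_fetch(rules, url))
```

Matching is by path prefix, so this only helps if topic URLs actually carry the forum in their path; if they do not, a per-page robots meta tag would be the alternative.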
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9601
Location: almost Mile High in the USA

Posted: Wed Mar 05, 2014 3:21 pm

I was surprised to find my f.g.o posts on Google fairly quickly. It's not the only site I visit that seems to get indexed almost instantly by Google, and I'm quite shocked; this must be pretty hard on forum software and networks.

Can this indexing be rate limited somehow? Since I'm not an admin, I have no clue how bad it is compared to regular traffic (I do see the bots on my servers, but I don't have as much content...)
SirRobin2318
Apprentice


Joined: 24 Apr 2004
Posts: 241
Location: Strasbourg, france.

Posted: Wed Mar 05, 2014 3:46 pm

The forum search is useless, so we need the crawlers, and I'm not sure that blocking non-essential forums would really help. Worth a try.
desultory
Bodhisattva


Joined: 04 Nov 2005
Posts: 9410

Posted: Thu Mar 06, 2014 4:28 am

666threesixes666 wrote:
I'd say Google DoS-attacking the forum and taking all functionality away is the greater of two evils.
Just to clarify, for those not present when we discussed this on IRC: the periods when the site gets swamped by crawlers are rarely due to any single crawler anymore. So far as I am aware, none have been specifically due to Googlebot since we restricted some of the more dynamic pages that are resource-intensive to generate: user lists, general search, and so on. It tends, apparently, to occur when multiple separate crawlers from multiple separate search engines end up attempting to perform many requests more or less simultaneously.
PaulBredbury
Watchman


Joined: 14 Jul 2005
Posts: 7310

Posted: Thu Mar 06, 2014 1:40 pm

Collect the IP addresses of known bots, then rate-limit their new connections as a group, using iptables.

"man iptables", see sections on limit and connlimit. (Edit:) Also recent.
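
To illustrate the "rate-limit as a group" idea from the post above, here is a small user-space sketch in Python. It is not an iptables configuration: it only shows the logic of one shared token bucket for every address in a set of known bot networks, while other clients pass through untouched. The network ranges are documentation placeholders (assumptions), and the class names are made up for this example.

```python
import ipaddress
import time


class TokenBucket:
    """Simple token bucket: refills at `rate` tokens/second, holds at most `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class BotGroupLimiter:
    """Rate-limit all known bot addresses together through one shared bucket."""

    def __init__(self, bot_networks, rate, burst, clock=time.monotonic):
        self.networks = [ipaddress.ip_network(n) for n in bot_networks]
        self.bucket = TokenBucket(rate, burst, clock)

    def is_bot(self, addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in self.networks)

    def allow(self, addr):
        # Regular visitors are never limited; bots compete for the same tokens.
        if not self.is_bot(addr):
            return True
        return self.bucket.allow()


if __name__ == "__main__":
    # Placeholder ranges (TEST-NET blocks), standing in for collected bot IPs.
    limiter = BotGroupLimiter(["192.0.2.0/24", "198.51.100.0/24"], rate=5.0, burst=10)
    for addr in ("192.0.2.10", "198.51.100.7", "203.0.113.5"):
        print(addr, "allowed" if limiter.allow(addr) else "limited")
```

Because the bucket is shared, several crawlers arriving at once are throttled as a whole, which matches the situation described earlier in the thread where the load comes from many crawlers acting simultaneously rather than from any single one.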
All times are GMT