Wiki server overwhelmed by scrapers

flexibeast
l33t
Posts: 679
Joined: Mon Apr 04, 2022 4:15 am
Location: Naarm/Melbourne, Australia

Post by flexibeast » Wed Feb 18, 2026 11:45 pm

Hi all,

A heads-up, quoting the "News" column on the "Main Page" of the wiki:
The wiki has been enduring heavy scraper bot activity recently, to the point that the site has been repeatedly overwhelmed. These overloads have been causing errors such as 500 Internal Server, Bad Gateway (502), Gateway Timeout (504), or long or incomplete page loading. The infrastructure team has been working to keep the servers running through all this, and hopefully the issues will be contained.

The core services powering Gentoo, connectivity, parts, repair, remote hands, hosting, and power all need to be paid for by Gentoo - any donations to help keep the lights on are appreciated.
https://wiki.gentoo.org/wiki/User:Flexibeast
My most recent wiki contributions
s0ulslack1
n00b
Posts: 31
Joined: Sun Mar 06, 2022 4:29 am

Post by s0ulslack1 » Thu Feb 19, 2026 12:20 am

Maybe an rss type setup could help?
flexibeast
l33t
Posts: 679
Joined: Mon Apr 04, 2022 4:15 am
Location: Naarm/Melbourne, Australia

Post by flexibeast » Thu Feb 19, 2026 12:42 am

s0ulslack1 wrote: Maybe an rss type setup could help?
Sorry, i'm not sure i follow...? What does RSS have to do with things like (say) AI bots scraping content to use as training data?

If a bot is merely wanting to know the latest changes, it could simply examine the "Recent Changes" page. i'm not aware that this is available as an RSS/Atom feed; if it's not, i imagine there's a MediaWiki plugin that could provide one. But at any rate, the format of the "Recent Changes" page is consistent and could be 'parsed' (in the broad sense) for the required information.
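
To illustrate the feed idea, here is a minimal sketch and not a description of how the wiki is actually set up. It assumes a recent-changes Atom feed exists; the sample XML, the example.org URLs, and the function name parse_recent_changes are all made up for illustration. With Python's standard library, the entries could be extracted like this:

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace, needed to address elements in the feed.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed; a real feed would be fetched over HTTP instead.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Recent changes (sample)</title>
  <entry>
    <title>Example page A</title>
    <updated>2026-02-18T23:00:00Z</updated>
    <link rel="alternate" href="https://example.org/wiki/Example_page_A"/>
  </entry>
  <entry>
    <title>Example page B</title>
    <updated>2026-02-18T22:30:00Z</updated>
    <link rel="alternate" href="https://example.org/wiki/Example_page_B"/>
  </entry>
</feed>"""


def parse_recent_changes(feed_xml):
    """Return a list of dicts (title, updated, link), one per Atom entry."""
    root = ET.fromstring(feed_xml)
    changes = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        updated = entry.findtext("atom:updated", default="", namespaces=ATOM_NS)
        link = entry.find("atom:link", ATOM_NS)
        href = link.get("href", "") if link is not None else ""
        changes.append({"title": title, "updated": updated, "link": href})
    return changes


if __name__ == "__main__":
    for change in parse_recent_changes(SAMPLE_FEED):
        print(change["updated"], change["title"], change["link"])
```

A client that polls one small feed like this makes a single request per interval instead of crawling every page, although that only helps with bots that are willing to behave.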
https://wiki.gentoo.org/wiki/User:Flexibeast
My most recent wiki contributions
Banana
Moderator
Posts: 2378
Joined: Fri May 21, 2004 12:02 pm
Location: Germany

Post by Banana » Thu Feb 19, 2026 7:29 am

If the server is overwhelmed, any other solution served by that same server is affected as well. RSS as another solution will not help unless it is served from elsewhere.
Forum Guidelines

PFL - Portage file list - find which package a file or command belongs to.
My delta-labs.org snippets do expire
NeddySeagoon
Administrator
Posts: 56077
Joined: Sat Jul 05, 2003 9:37 am
Location: 56N 3W

Post by NeddySeagoon » Thu Feb 19, 2026 9:47 am

Bugs and the forums get hit with the same problem.
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
gorg86
Guru
Posts: 359
Joined: Fri May 20, 2011 6:20 pm

Post by gorg86 » Thu Feb 19, 2026 7:20 pm

Over the last two weeks I had to read through a lot of wikis, and that was very annoying.
I'm not in charge of the infrastructure, of course, but I'm certain this is being caused by bots that scrape for AI.
We need to find a way to block those bad actors.
Other sites are suffering from this, too.
Chiitoo
Administrator
Posts: 3048
Joined: Sun Feb 28, 2010 5:36 pm
Location: Sore wa sore, kore wa kore... nanoda.

Post by Chiitoo » Thu Feb 19, 2026 7:24 pm

We are playing whack-a-scraper a lot in the background, but can't quite keep up with it.

The other option is something like Anubis, which will be very restrictive, though I want to get back to trying it out and configuring it so that all the real people can get in.
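
For context on why this kind of tool is restrictive: as far as I know, Anubis is built around a proof-of-work challenge that the visitor's browser has to solve before it gets through. The following is a generic hashcash-style sketch of that idea in Python, not Anubis's actual code or configuration; the function names, the challenge string, and the difficulty value are all hypothetical.

```python
import hashlib


def _digest(challenge: str, nonce: int) -> str:
    """SHA-256 hex digest of the challenge string followed by the nonce."""
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce whose digest starts with
    `difficulty` hexadecimal zeros. Cost grows about 16x per extra zero."""
    prefix = "0" * difficulty
    nonce = 0
    while not _digest(challenge, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: checking a submitted nonce takes a single hash."""
    return _digest(challenge, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    challenge = "example-challenge"  # hypothetical per-visitor token
    nonce = solve(challenge, 3)
    print(nonce, verify(challenge, nonce, 3))
```

The asymmetry is the point: one visitor pays a small one-off cost, while a scraper making a very large number of requests pays it over and over. Clients that cannot or will not run the challenge are locked out, which is where the restrictiveness comes from.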
Kindest of regardses.
gorg86
Guru
Posts: 359
Joined: Fri May 20, 2011 6:20 pm

Post by gorg86 » Thu Feb 19, 2026 7:28 pm

A while back I read a post on the Level1Techs forum (I think), where someone managed to solve the AI bot issue without breaking stuff.
I'll try to dig it up.
EDIT:
Close enough, it was Geoffrey McRae and he had some help: https://x.com/geoffrey_mcrae/status/2004062396697268619
I'm not a fan of Cloudflare...
Zucca
Moderator
Posts: 4690
Joined: Thu Jun 14, 2007 10:31 pm
Location: Rasi, Finland

Post by Zucca » Thu Feb 19, 2026 8:23 pm

gorg86 wrote: Close enough, it was Geoffrey McRae and he had some help: https://x.com/geoffrey_mcrae/status/2004062396697268619
I'm not a fan of Cloudflare...
There has been some talk about CrowdSec before. It is, for sure, an interesting project.
..: Zucca :..

init=/sbin/openrc-init
-systemd -logind -elogind seatd
I am NaN! I am a man!