XeIaso wrote:Hi, please forgive me in advance. I tried to format this in BBCode but gave up and will be using Markdown to format this instead.
Neddy gave good advice on how to do this by hand. The easiest approach though is just to choose to reply-with-quote, and let the forum prepopulate the response with the appropriate quote tags.
XeIaso wrote:I have been working on something like this, but I have also been getting a lot of abuse and hate recently and something that opens the door to getting more of it is really not on the front burner.
I am not proposing making it easier for people to submit reports. I only propose making it easy for them to gather information so that, if they choose to manually submit a report, it is a good one. Your reference in the next paragraph (not quoted) to the lgbt server would satisfy this nicely.
As for abuse, I'm not surprised. The user experience on a false-positive is terrible. The block page advertises Anubis and CELPHASE, but does not direct users how to contact the operator who actually installed the misbehaving Anubis instance. Lacking any other directions, I'd expect users to come straight to you, even though you cannot help them. At least some sites that use Anubis will be used by people who are inclined to shoot first and ask questions later. This is further exacerbated by the fact that users hit this block page when they're trying to do useful work, and Anubis steps in and smacks them down without adequate explanation. It took very few false positives with Anubis for me to come to despise it.
Maybe CELPHASE asked to be credited on the block page, but personally, I wouldn't want my name and contact information advertised on the page shown to users who just got blocked, and are now upset that they can't get the document they requested. That seems like setting up for CELPHASE to catch some unjustified collateral abuse.
This is good to know. However, (1) it requires a GitHub account (which in turn has quite a long Terms of Service document to read) and (2) from what I recall of the Anubis failure page, there is no mention in that page about this. Anyone who was blocked is thus left with no way to learn what they should do to get unblocked. This goes to a more general problem with the block pages. They should direct people how to contact the guilty operator, not encourage people to go directly to you.
XeIaso wrote:> I had not done anything to make myself suspicious, and Anubis always went straight to the JavaScript challenge.
Can you make a complete list of every single thing you have done to your browser from the unmodified out of the box defaults including extensions, settings changed, and other things?
No. I don't keep my browser profile under source control, and I've had it long enough that I cannot tell you from memory everything I've ever needed to change. However, I can refer you back to the original post of this thread. As I understood that poster, an out-of-the-box firefox-bin on Gentoo in a VM was blocked by Anubis. I will caution that it sounds like that was actually a harder block than I hit. My problem was "only" that Anubis was insisting on a CPU-wasting proof-of-work that NoScript correctly blocked. OP's problem reads more like he was dropped into a dead end from which there was no exit, even if he did run the expensive script.
XeIaso wrote:I mainly test against the default out of the box configuration because I am a single person working on this and my test matrix is already impossibly large as it is. Extensions like NoScript, JShelter, whatever cookie blocker, etc cause me to have to exponentially increase the size of that testing matrix and I haven't had the time to make a proper integration jungle like I've wanted.
I am a devoted NoScript user from way back, and believe the world would be a better place if browser vendors had shipped, as a standard feature, at least a basic version of NoScript twenty years ago, before the current generation of site authors developed the mindset of just assuming every browser will run every crazy script referenced in the page. If the site authors had to stop and think about whether the page would work for a first-time user who hasn't yet been convinced to allow scripts, maybe the experience with scripts blocked would not be quite so bad.
Yes, I read that post a few weeks ago. I spent a fair bit of time reading up on Anubis when I realized what a problem it was becoming.
XeIaso wrote:> That setting merely disables automatic refresh. When the Anubis guarding gcc's bugzilla was upgraded to offer metarefresh[1], I had to hit Allow on the refresh for Firefox to follow it, and then it worked.
What extension does that behaviour? Why are you using this extension in general? I'd like to know so I can adjust my testing matrix appropriately.
There probably is some extension to slap a nice GUI on this, but I likely flipped it right in about:config. I set this a long time ago because I don't like the user experience of being bounced through multiple pages without warning. I don't think there is anything Anubis can or should do with regard to this setting. Anubis sent a meta-refresh redirect. Firefox obeyed my preference to ask for permission before following it. I granted that permission. Firefox followed the redirect, and Anubis allowed me to access the requested content. This was a substantial improvement over the earlier Anubis version that dumped me into a proof-of-work challenge when trying to read any gcc bug reports.
XeIaso wrote:> A cynical interpretation of this data would be that users on low-end systems are not welcome on Anubis-protected sites.
My intent behind working on Anubis is to keep websites online. Websites that are offline cannot teach anyone anything.
Sites that indefinitely block someone with no explanation cannot be used by affected users, either.
Proof of React is an interesting idea to cut down on the CPU load, but it still has the critical flaw that it requires users to run JavaScript. As long as you cling to solutions that require JavaScript, you will continue to have problems with NoScript users. Further, since Anubis has built up a reputation for its JavaScript being very expensive (as keeps happening to pa4wdh, who needs 30 seconds(!) to complete one challenge), I expect that people who use NoScript will not be eager to go around granting Anubis permission to run any scripts. On the bright side, perhaps the repeated CPU abuse from Anubis will encourage more people to install NoScript.
XeIaso wrote:This is fairly opaque to the client by design to avoid "leaking" details of how the administrator configured it.
Yes, this is one place where security through obscurity is worth at least a small amount.
XeIaso wrote:The meta refresh challenge is a work in progress because the abusive scrapers are starting to learn how to handle it
Although unfortunate, that is entirely expected.
XeIaso wrote:I can look into a way to make sure that people can "fall back" to the meta refresh challenge, but this would be something that would be off by default and administrators would need to choose to enable it.
I thought Anubis preferred meta-refresh for any clients it deemed sufficiently low risk, in which case anyone who doesn't get meta-refresh now is by definition too risky to be allowed to use it, even as a manually initiated fallback.
XeIaso wrote:It is just slow going because this is frankly a difficult problem.
Yes. I think that you will never be able to build a system that the scrapers cannot solve if they care to try. At best, you will build a system like the JavaScript proof-of-work, where you antagonize everyone equally with busy work, and then hope that the scrapers aren't willing to do enough busy work to get all the content they want. At some point, the scrapers may switch to running a full browser under Xvfb and driving it via an extension, at which point Anubis will conclude the scraper is a "real browser" and let it through. Yes, it's expensive CPU-wise, but I expect anyone who has the CPU capacity to run the LLM on the backend after training can afford the CPU time to solve proof-of-work challenges during the training collection phase.
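
To make the "busy work" point concrete, here is a minimal sketch of a generic hash-based proof-of-work. This is not Anubis's code. The scheme (SHA-256 over a challenge string plus a counter, accepted when the hex digest starts with a given number of zeros) and the names `solve`, `verify`, and `example-challenge` are my assumptions for illustration only. It shows the asymmetry: the server verifies with one hash, while every client, human or scraper, pays on average about 16^difficulty hashes.

```python
import hashlib
import time


def solve(challenge: str, difficulty: int) -> tuple[int, str]:
    """Brute-force the smallest nonce whose digest starts with `difficulty` hex zeros.

    Hypothetical scheme for illustration; real deployments may differ.
    """
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server-side check: a single hash, regardless of how long the client worked."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # Each extra hex digit multiplies the expected client work by 16.
    for difficulty in (2, 3, 4):
        start = time.perf_counter()
        nonce, digest = solve("example-challenge", difficulty)
        elapsed = time.perf_counter() - start
        print(f"difficulty={difficulty} nonce={nonce} "
              f"hashes={nonce + 1} time={elapsed:.4f}s digest={digest[:16]}...")
```

The cost lands the same way on a low-end laptop and on a scraper farm, except that the scraper farm has far more CPU to spend, which is why I expect this to deter honest users on slow hardware more than it deters a determined scraper.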