"If any client requests /wp-admin, flag their IP ASN as bot" You are going to hi...

afandian · on Oct 24, 2024

Why? Who is legitimately going to that address but the site admin?

PeterStuer · on Oct 24, 2024

If you ban an IP or even an ASN, there could be (many) thousands sharing that same identifier. Some kid will unknowingly run some free game that does some lightweight scraping in the background as monetization and you ban the whole ISP?

notachatbot123 · on Oct 24, 2024

> some free game that does some lightweight scraping in the background as monetization

What in the flying **. Is this a common thing?

dns_snek · on Oct 24, 2024

For some definition of "common", yes. Some try to be less shady by asking for consent (e.g. in exchange for in-game credits), others are essentially malware.

For example: https://bright-sdk.com/

> Bright SDK is approved by Apple, Amazon, LG, Huawei, Samsung app stores, and is whitelisted by top Antivirus companies.

chatmasta · on Oct 24, 2024

FYI this is a rebranding of the notorious “Luminati” service that sold a residential proxy network based on the “Hola VPN” Chrome extension. They’ve upped their game and now pay application developers to embed their botnet in their application.

naikrovek · on Oct 24, 2024

the idea that games should be written solely to extract revenue from players is so repulsive to me that I actively disrespect and disfavor people I know who work on things like this.

humans are a truly horrible species and this kind of thing is a great example of why I believe that.

internet101010 · on Oct 24, 2024

That's every billion dollar publisher that releases games with initial purchase + microtransactions beyond cosmetics. So Activision/Blizzard, EA, Take Two, and Ubisoft. Like it's one thing to do free-to-play + pay-to-win but it's quite another to charge $60 and then make the game worse solely to drive people to buy things that will make it suck less. And they all do it.

hypeatei · on Oct 24, 2024

Residential IPs are extremely valuable for scraping or other automation flows, so yeah, getting kids to run a free game that has malware seems plausible.

codingdave · on Oct 24, 2024

How would that be a false positive? The kid might not be malicious, but they absolutely are running a bot, even if unknowingly. If anything, calling attention to it could help people notice, and therefore clean up such things.

detaro · on Oct 24, 2024

The kid isn't. But everyone else using their ISP that your ASN-based block also blocks is a false positive. An ASN block easily has a granularity of "10% of an entire large country". And nobody is going to take your site blocking e.g. all Comcast users as "oh, we should investigate which Comcast user made some slightly suspicious requests, thanks for telling us".
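To make the granularity point concrete, here is a rough Python sketch comparing a single-IP block with an ASN-wide block. The prefix list is made up (documentation ranges, not real routing data), and a real ASN would typically announce far more address space than this.

```python
import ipaddress

# Hypothetical prefixes announced by one ASN. These are documentation
# ranges chosen for illustration, not real routing data.
ASN_PREFIXES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def blocked_by_ip(client: str, offender: str) -> bool:
    """A per-IP block only hits the exact address that misbehaved."""
    return ipaddress.ip_address(client) == ipaddress.ip_address(offender)


def blocked_by_asn(client: str) -> bool:
    """An ASN block hits every address inside any prefix of that ASN."""
    addr = ipaddress.ip_address(client)
    return any(addr in net for net in ASN_PREFIXES)


def addresses_covered() -> int:
    """How many addresses the ASN-wide block takes out in total."""
    return sum(net.num_addresses for net in ASN_PREFIXES)
```

With an offender at `203.0.113.7`, a neighbour at `198.51.100.20` passes the per-IP check but is caught by the ASN check, which is exactly the false positive being described.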

codingdave · on Oct 24, 2024

Fair, but we are talking about blocking just OP's site, correct? OP flagging a bot doesn't take down that ISP's access to the internet, unless I'm grossly misunderstanding the power any individual site owner has.

So is that such a bad thing? If OP is going to use this to provide data about bots, blocking mass amounts of the internet could actually be a terrific example of how many people are at least tangentially connected to bots.

pzmarzly · on Oct 24, 2024

There are some browser plugins that try to guess what technologies are used by the website you are visiting. I hope the better ones can guess it by just looking at HTML and HTTP headers, but wouldn't be surprised if others were querying some known endpoints.

etiennebausson · on Oct 24, 2024

Then they are, by definition, bots scraping the site for information, and should start with robots.txt.

bbarnett · on Oct 24, 2024

Only someone poking about would ever hit that url on someone else's domain, so where's the downside?

And "a lot" of false positives?? Recall, robots.txt is set to ignore this, so only malicious web scanners will hit it.
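A minimal sketch of that honeypot idea in Python, assuming a robots.txt that disallows `/wp-admin` (the file contents and the helper names are hypothetical, not OP's actual setup). It only shows how to tell that a requested path is one that well-behaved crawlers were told to skip; what to do with the flagged client is a separate decision.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the site under discussion.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin
"""


def is_disallowed(path: str, user_agent: str = "*") -> bool:
    """True if robots.txt tells this user agent not to fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, path)


def looks_like_scanner(requested_path: str) -> bool:
    """Flag a request that touches a path crawlers were asked to skip."""
    return is_disallowed(requested_path)
```

As the replies point out, a hit on such a path is a signal about one request, not proof that everyone behind the same IP or ASN is a bot.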

poincaredisk · on Oct 24, 2024

The downside is that you ban a whole ISP because of a single user misbehaving.

Personally I sometimes do a quick request to /wp-admin to check if a site is WordPress, so I guess that has a nonzero chance of affecting me. And when I mirror a website I almost always ignore robots.txt (I'm not a robot and I do it for myself). And when I randomly open robots.txt and see a weird url I often visit it. And these are just my quirks. Not a problem for a fun website, but please don't ban a whole IP - or even whole ISP - because of this.

bbarnett · on Oct 24, 2024

Well you make a point, I use ipset in many circumstances, which has an expire option.

So that is a balance between a bad actor and even "stop it" blocks, and auto expire means transitory denial.
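For anyone who wants the auto-expire behaviour without ipset, here is a rough in-process sketch in Python. It is not ipset and does not touch the firewall; the class name and the injectable `clock` parameter are my own choices for illustration.

```python
import time


class ExpiringBlocklist:
    """Block entries that lapse on their own after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock        # injectable so the expiry can be tested
        self._expires_at = {}     # ip string -> time at which the block ends

    def block(self, ip):
        """Add or refresh a block for this IP."""
        self._expires_at[ip] = self.clock() + self.ttl

    def is_blocked(self, ip):
        """True while the block is active; expired entries are dropped."""
        expires = self._expires_at.get(ip)
        if expires is None:
            return False
        if self.clock() >= expires:
            del self._expires_at[ip]
            return False
        return True
```

Usage would be something like `bl = ExpiringBlocklist(ttl_seconds=3600)`, then `bl.block(ip)` on a suspicious hit and `bl.is_blocked(ip)` on every request, so a shared address is only denied temporarily.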

PeterStuer · on Oct 24, 2024

Do you own your ASN or unique IP? Do you like getting banned for the actions of others that share your ASN or IP?

str3wer · on Oct 24, 2024

what chance of a false positive are we even talking about?

2000swebgeek · on Oct 24, 2024

better yet, see if bots access /robots.txt, find them from there. no human looks at robots.txt :)

add a captcha by limiting IP requests or return 429 to rate limit by IP. Using popular solutions like cloudflare could help reduce the load. Restrict by country. Alternatively, put in a login page which only solves the captcha and issues a session.
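A minimal sketch of the "return 429 to rate limit by IP" part, as a sliding-window counter in Python. The class and method names are hypothetical, the limits are placeholders, and a real deployment would more likely do this in the reverse proxy or CDN than in application code.

```python
import time
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Allow at most max_requests per IP within a sliding time window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock                 # injectable so it can be tested
        self._hits = defaultdict(deque)    # ip -> timestamps of recent requests

    def status_for(self, ip):
        """Return the HTTP status to send: 200 if allowed, 429 if over the limit."""
        now = self.clock()
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return 429
        hits.append(now)
        return 200
```

Rejected requests are not counted against the window here, which is a design choice; counting them would keep a persistent hammering client locked out for longer.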

quectophoton · on Oct 24, 2024

> no human looks at robots.txt :)

I... I do... sometimes. Mostly curiosity when the thought randomly pops into my head. I mean, I know I might be flagged by the website as someone weird/unusual/suspicious, but sometimes I do it anyway.

Btw, do you know if there's any easter egg on Hacker News' own robots.txt? Because there might be.

naikrovek · on Oct 24, 2024

> no human looks at robots.txt

of course people look at this. it's not an everyday thing for the prototypical web user, but some of us look at those a lot.

pkaeding · on Oct 24, 2024

Yes, but those of us also often have trouble solving captchas.