If you ban an IP or even an ASN, there could be (many) thousands of users sharing that same identifier. Some kid unknowingly runs a free game that does some lightweight scraping in the background as monetization, and you ban the whole ISP?
For some definition of "common", yes. Some try to be less shady by asking for consent (e.g. in exchange for in-game credits), others are essentially malware.
FYI this is a rebranding of the notorious “Luminati” service that sold a residential proxy network based on the “Hola VPN” Chrome extension. They’ve upped their game and now pay application developers to embed their botnet in their applications.
The idea that games should be written solely to extract revenue from players is so repulsive to me that I actively disrespect and disfavor people I know who work on things like this.
Humans are a truly horrible species, and this kind of thing is a great example of why I believe that.
That's every billion-dollar publisher that releases games with an initial purchase + microtransactions beyond cosmetics. So Activision/Blizzard, EA, Take-Two, and Ubisoft. Like, it's one thing to do free-to-play + pay-to-win, but it's quite another to charge $60 and then make the game worse solely to drive people to buy things that will make it suck less. And they all do it.
Residential IPs are extremely valuable for scraping and other automation flows, so yeah, getting kids to run a free game that contains malware seems plausible.
How would that be a false positive? The kid might not be malicious, but they absolutely are running a bot, even if unknowingly. If anything, calling attention to it could help people notice, and therefore clean up such things.
The kid isn't. But everyone else using their ISP that your ASN-based block also blocks is a false positive. An ASN block easily has a granularity of "10% of an entire large country". And nobody is going to take your site blocking e.g. all Comcast users as "oh, we should investigate which Comcast user made some slightly suspicious requests, thanks for telling us".
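To put rough numbers on the granularity point, here is a minimal sketch that counts how many IPv4 addresses a block covers. The prefixes are hypothetical placeholders (private and documentation ranges), not what any real ISP announces, and the count assumes the prefixes don't overlap.

```python
import ipaddress


def blocked_addresses(prefixes):
    """Count the IPv4 addresses covered by a list of non-overlapping CIDR prefixes."""
    return sum(ipaddress.ip_network(p).num_addresses for p in prefixes)


# One offending client, blocked by IP (documentation address, for illustration).
single_ip = ["198.51.100.7/32"]

# Hypothetical prefixes standing in for what one ISP's ASN might announce.
hypothetical_asn = ["10.0.0.0/12", "10.16.0.0/13"]

print(blocked_addresses(single_ip))         # 1
print(blocked_addresses(hypothetical_asn))  # 1572864
```

Even two mid-sized prefixes cover about 1.5 million addresses, and a large ISP announces far more than two.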
Fair, but we are talking about blocking just OP's site, correct? OP flagging a bot doesn't take down that ISP's access to the internet, unless I'm grossly misunderstanding the power any individual site owner has.
So is that such a bad thing? If OP is going to use this to provide data about bots, blocking mass amounts of the internet could actually be a terrific example of how many people are at least tangentially connected to bots.
There are some browser plugins that try to guess what technologies are used by the website you are visiting. I hope the better ones can guess it by just looking at the HTML and HTTP headers, but I wouldn't be surprised if others were querying some known endpoints.
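For what it's worth, the passive approach could look roughly like this. It's a sketch, not how any particular plugin works: `guess_technologies` is a made-up helper, and the handful of signals it checks (the `Server` and `X-Powered-By` headers, the `generator` meta tag, `/wp-content/` paths) are just common examples that sites can hide or fake.

```python
import re

# WordPress (among other CMSs) commonly emits a generator meta tag like this.
GENERATOR = re.compile(r'<meta\s+name="generator"\s+content="([^"]+)"', re.I)


def guess_technologies(headers, html):
    """Guess technologies from a single response, without any extra requests."""
    found = set()
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}

    server = lowered.get("server", "")
    if "nginx" in server:
        found.add("nginx")
    if "apache" in server:
        found.add("Apache")
    if "php" in lowered.get("x-powered-by", ""):
        found.add("PHP")

    match = GENERATOR.search(html)
    if match and "wordpress" in match.group(1).lower():
        found.add("WordPress")
    if "/wp-content/" in html:
        found.add("WordPress")
    return found
```

Nothing here touches /wp-admin or any other endpoint, so it wouldn't trip a trap that watches for probing requests.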
The downside is that you ban a whole ISP because of a single user misbehaving.
Personally I sometimes do a quick request to /wp-admin to check if a site is WordPress, so I guess that has a nonzero chance of affecting me. And when I mirror a website I almost always ignore robots.txt (I'm not a robot and I do it for myself). And when I randomly open robots.txt and see a weird URL I often visit it. And these are just my quirks. Not a problem for a fun website, but please don't ban a whole IP - or even a whole ISP - because of this.
Better yet, see if bots access /robots.txt and find them from there. No human looks at robots.txt :)
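A rough sketch of that, assuming the server writes access logs in Common Log Format. `clients_requesting` is a hypothetical helper, and the sample lines use documentation IPs. It only collects the client addresses; what to do with them is a separate decision.

```python
import re

# Start of a Common Log Format line: client IP, identity, user, timestamp, request line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)')


def clients_requesting(path, log_lines):
    """Return the set of client IPs that requested `path`, ignoring query strings."""
    clients = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(3).split("?", 1)[0] == path:
            clients.add(match.group(1))
    return clients


sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /robots.txt HTTP/1.1" 200 68',
    '198.51.100.7 - - [10/Oct/2023:13:55:40 +0000] "GET /index.html HTTP/1.1" 200 1024',
]
print(clients_requesting("/robots.txt", sample))  # {'203.0.113.5'}
```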
Add a captcha once an IP exceeds a request limit, or return 429 to rate-limit by IP. Popular solutions like Cloudflare could help reduce the load. Restrict by country. Alternatively, put up a login page that only requires solving the captcha and then issues a session.
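A minimal sketch of the 429 part, as a sliding-window counter per IP. Everything here is an assumption for illustration: `IpRateLimiter` is a made-up class, the state lives in memory in a single process, and a real deployment would more likely use the web server's or CDN's built-in rate limiting.

```python
import time
from collections import defaultdict, deque


class IpRateLimiter:
    """Allow at most `max_requests` per IP within any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.hits = defaultdict(deque)  # ip -> timestamps of recently allowed requests

    def status_for(self, ip):
        """Return the HTTP status to send: 200 if allowed, 429 if over the limit."""
        now = self.clock()
        recent = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return 429
        recent.append(now)
        return 200
```

Usage would be something like `limiter = IpRateLimiter(60, 60)` and then checking `limiter.status_for(client_ip)` before handling each request. A 429 could just as well be swapped for a captcha challenge at the same point.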
I... I do... sometimes. Mostly out of curiosity, when the thought randomly pops into my head. I mean, I know I might be flagged by the website as someone weird/unusual/suspicious, but sometimes I do it anyway.
Btw, do you know if there's any easter egg on Hacker News' own robots.txt? Because there might be.
You are going to hit a lot more false positives with this one than actual bots.