
Having worked on bot detection in the past, I can say some really simple, old-fashioned attacks happened by doing the opposite of what the robots.txt file says.

While I doubt it does much today, that file really only matters to those who want to play by the rules, and on the free web that is not a lot of the web anymore, I'm afraid.
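To make the "do the opposite of robots.txt" pattern concrete, here is a minimal sketch (hypothetical paths and log format, not anyone's production detector) of flagging clients that request exactly the paths a site's robots.txt asks crawlers to avoid, including a decoy "honeypot" path no well-behaved visitor would ever reach:

  from urllib.robotparser import RobotFileParser

  # Our own robots.txt, including a decoy path that only a rule-ignoring
  # crawler would ever discover and fetch.
  ROBOTS_TXT = """\
  User-agent: *
  Disallow: /admin/
  Disallow: /honeypot/
  """

  parser = RobotFileParser()
  parser.parse(ROBOTS_TXT.splitlines())

  def is_suspicious(path: str) -> bool:
      # A request to a Disallow'd path suggests a client doing the
      # opposite of what robots.txt asks.
      return not parser.can_fetch("*", path)

  # Toy access-log entries: (client_ip, requested_path)
  access_log = [
      ("203.0.113.7", "/index.html"),
      ("198.51.100.9", "/honeypot/creds.txt"),
  ]

  for ip, path in access_log:
      if is_suspicious(path):
          print(f"possible rule-ignoring bot: {ip} requested {path}")

The decoy entry is the whole trick: legitimate crawlers never fetch it, regular users never see a link to it, so any hit on it is a strong signal.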



That was the first thing I learned about the robots.txt file. Even RFC 9309, the Robots Exclusion Protocol (https://www.rfc-editor.org/rfc/rfc9309.html), mentions:

> These rules are not a form of access authorization.

Meaning that these rules are not enforced in any way; they cannot actually prevent you from accessing anything.
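A small sketch of why it is purely advisory (example.com and the user agent string are placeholders): urllib.robotparser can tell a client what robots.txt asks, but nothing stops the client from making the request anyway.

  from urllib.request import urlopen
  from urllib.robotparser import RobotFileParser

  rp = RobotFileParser("https://example.com/robots.txt")
  rp.read()  # fetch and parse the site's robots.txt

  url = "https://example.com/"  # hypothetical target URL
  print("robots.txt allows:", rp.can_fetch("SomeCrawler/1.0", url))

  # Whether the answer above is True or False, this request still goes
  # through; robots.txt is consulted only if the client chooses to.
  with urlopen(url) as resp:
      print(resp.status, len(resp.read()), "bytes fetched")

The check and the fetch are entirely independent; enforcement has to happen server-side (authentication, rate limiting, blocking), not in the robots.txt file.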

I think the only approach that could work in this scenario would be to find out which companies disregard robots.txt and bring it to the attention of the technical community. Practices like these could make a company look shady and untrustworthy if found out. That could be one way to keep them accountable, even though there is still no guarantee they will abide by it.



