
The AI isn't "reading the web", though. It is reading the top hits in the search results and free-riding on the access that Google/Bing get in exchange for sending actual user traffic to those sites. Many webmasters specifically opt their pages out of the search results (via robots.txt and/or "noindex" directives) when they believe the cost of the bot traffic isn't worth the user traffic they may get from being listed.

One of my websites that gets a decent amount of traffic has pretty close to a 1-1 ratio of Googlebot accesses compared to real user traffic referred from Google. As a webmaster I'm happy with this and continue to allow Google to access the site.

If ChatGPT is giving my website a ratio of 100 bot accesses (or more) compared to 1 actual user sent to my site, I very much should have the right to decline their access.
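For anyone who wants to check that ratio on their own site, here is a rough sketch of the arithmetic. The `Hit` record, the function name, and the matching rules (a substring of the user agent for the crawler, a substring of the referrer for the referring site) are all assumptions made for illustration; real access logs need proper parsing and user agents can be spoofed.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Hit:
    """One access-log entry (hypothetical, simplified shape)."""
    user_agent: str
    referrer: str


def crawl_to_referral_ratio(
    hits: Iterable[Hit], bot_token: str, referrer_host: str
) -> Optional[float]:
    """Return crawler hits per referred human visit, or None if no visits were referred."""
    token = bot_token.lower()
    bot_hits = 0
    referred_visits = 0
    for hit in hits:
        if token in hit.user_agent.lower():
            # Request made by the crawler itself.
            bot_hits += 1
        elif referrer_host in hit.referrer:
            # Non-crawler request that arrived via the referring site.
            referred_visits += 1
    if referred_visits == 0:
        return None
    return bot_hits / referred_visits
```

A result near 1.0 matches the Googlebot situation described above; a result of 100.0 or more is the case being objected to.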



> If ChatGPT is giving my website a ratio of 100 bot accesses (or more) compared to 1 actual user sent to my site

Are you trying to collect ad revenue from the actual users? Otherwise, a chatbot reading your page because it found it by searching Google, and then relaying the info with a link to the user who asked for it, seems reasonable.


While yes, I am attempting to collect ad revenue from users, and yes, I don't want somebody competing with me and cutting me out of the loop, a large part of it is controlling my content. I'm not arguing whether the AI chatbot has the legal right to access the page; I'm not a legal scholar. What I'm saying is that the leading search engines have the same right to access whatever content they want, and yet they all give webmasters the following tools:

- Ability to prevent their crawlers from accessing URLs via robots.txt

- Ability to prevent a page from being indexed on the internet (noindex tag)

- Ability to remove existing pages that you don't want indexed (webmaster tools)

- Ability to remove an entire domain from the search engine (webmaster tools)
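As a small illustration of the first item, here is how a well-behaved crawler could check robots.txt before fetching a URL, using Python's standard `urllib.robotparser`. The crawler name `ExampleAIBot`, the rules, and the URLs are made up for the example. Note that robots.txt only governs crawling; "noindex" is a separate signal sent in a meta tag or an `X-Robots-Tag` response header, and it is not covered here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "ExampleAIBot" and the paths are illustrative
# assumptions, not taken from any real site.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "Googlebot"):
        verdict = is_allowed(ROBOTS_TXT, agent, "https://example.com/articles/1")
        print(agent, "allowed" if verdict else "blocked")
```

With these rules the made-up AI crawler is blocked everywhere while other crawlers are only kept out of `/private/`. All of this depends on the crawler choosing to honor the file, which is exactly the convention at issue.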

It is really impolite for the AI chatbots to go around and flout all these existing conventions because they know that webmasters would restrict their access because it's much less beneficial than it is for existing search engines.

In the long run, all this is going to lead to is more anti-bot countermeasures, more content behind logins (which can have legally binding anti-AI access restrictions) and less new original content. The victims will be all the humans who aren't using a chatbot, for the slight benefit of the ones who are.

And again, I'm not suggesting that AI chatbots should not be allowed to load webpages, just that webmasters should be able to opt out of it.


> While yes, I am attempting to collect ad revenue from users, and yes, I don't want somebody competing with me and cutting me out of the loop, a large part of it is controlling my content.

> It is really impolite for the AI chatbots to go around and flout all these existing conventions because they know that webmasters would restrict their access because it's much less beneficial than it is for existing search engines.

I agree with you about the long run effects on the internet at large, but I still don't understand the horse you have in it personally. I read you as saying (1) it's less about ad revenue than content control, but (2) content control is based on analysis of benefits, i.e. ad revenue?


Well you have no rights when you expose a server to the internet. Other than copyright of course.


> Well you have no rights when you expose a server to the internet.

Technically you don’t, but there are still laws that affect what you can legally do when accessing the web. Beyond the copyright issues that have been outlined by people a lot more qualified than me, I think you could also make the point that AI crawlers actively cause direct and indirect financial harm.



