
Very different, it would seem. Then again, it’s never been clear to me why LeCun believes that LLM architectures don’t inherently produce world models in the course of training.


Nor I.

IMO LLMs more or less literally cannot do what they do without a world model, not least because much of what language is, is a protocol for making assertions about that model, testing the degree to which it is shared, and seeking to alter the model one carries of one's interlocutor's model.

To the "parrot people" I suggest that there is no more optimized mechanism for the inner layers of a network to approach than one which most parsimoniously models the world, so as to correctly emit tokens that reflect it.



