> So I think it's overly dismissive to regard LLMs as mere surface statistics.
It's literally what they are though.
Yes, those probabilities embed human knowledge, but that doesn't mean the LLM itself is intelligent. That's why every LLM today fails at anything that isn't centred on rote learning.
It's what they input and output, but it's not literally what they are. The only way to squeeze that many statistics into a compact model is to curve-fit an approximation of the generating process itself. What it fits happens to be stochastic sequences (of any type, but usually text), but conceptually it's no different from any other ML model. It's no more "surface statistics" than a deep neural network trained for machine vision would be.
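
To make the compression point concrete, here's a back-of-envelope sketch. The vocabulary size, context length, and parameter count below are illustrative assumptions, not figures from any particular model:

```python
# Back-of-envelope comparison: storing raw "surface statistics" vs. a
# parametric model. All numbers are illustrative assumptions.

VOCAB = 50_000   # roughly a typical BPE vocabulary size (assumption)
CONTEXT = 8      # even a tiny 8-token context window (assumption)

# A literal lookup table of next-token statistics needs one row per
# possible context: VOCAB**CONTEXT rows.
table_entries = VOCAB ** CONTEXT
print(f"lookup-table rows for 8-token contexts: {table_entries:.2e}")
# ~3.91e+37 rows -- far beyond anything that could ever be stored

# By contrast, a 7B-parameter model at 2 bytes/parameter fits in ~14 GB.
model_bytes = 7e9 * 2
print(f"7B-parameter model: {model_bytes / 1e9:.0f} GB")

# Covering that context space in so little memory is only possible by
# compressing: fitting a function that approximates the generating
# process, i.e. the "curve-fitting" described above.
```

If the model were genuinely just a table of surface statistics, it couldn't fit in memory at all; the size gap is the whole argument.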