Then you did not read my post carefully enough. The question was not about "internal state" but "inner workings". The model clearly does something. The problem is that we don't know how to describe in human terms what happens between the matrix multiplication and the words it spits out. Whether it has state is completely irrelevant.