Ten hours is a decent amount of time, so I'm not too surprised the human won. LLMs don't really tend to improve the longer they get to chew on a problem (often the opposite in fact).
The LLM was probably getting nowhere trying to improve after the first few minutes.
On the livestream (perhaps elsewhere?) you can watch the submissions and scores come in over time. The LLM steadily increased (and sometimes decreased) its score over time, though by the end it did seem to hit a plateau. You could even see it try out new strategies (e.g. with walls) that didn't appear until about halfway through the competition.