Hacker News

Ten hours is a decent amount of time, so I'm not too surprised the human won. LLMs don't really tend to improve the longer they get to chew on a problem (often the opposite in fact).

The LLM was probably getting nowhere trying to improve after the first few minutes.



On the livestream (perhaps elsewhere?) you can watch the submissions and scores come in over time. The LLM steadily increased (and sometimes decreased) its score over time, though by the end it did seem to hit a plateau. You could even see it try out new strategies (e.g., with walls) that didn't appear until about halfway through the competition.


> The LLM was probably getting nowhere trying to improve after the first few minutes.

How did you come to that conclusion from the contents of the article?

The final scores are all relatively close. How could that happen if the AI was floundering the whole time? Just a good initial guess?


> How could that happen if the AI was floundering the whole time? Just a good initial guess?

Yes, that and marginal improvements over it.


Yep, self-reinforcement learning is missing from LLMs.


I would think the LLM, though, is not trying one solution for 10 hours like a human would.

I would assume the LLM tried an inhuman number of solutions, and the best of them took #2 in this contest.

Impressive by the human winner, but good luck with that in 2026.
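The "many attempts, keep the best" strategy is just best-of-N sampling. A toy sketch in Python (everything here is hypothetical: `propose_solution` stands in for one model attempt, and a real contest would score actual programs via a judge, not random numbers):

```python
import random

def propose_solution(rng: random.Random) -> float:
    # Hypothetical stand-in for one LLM attempt; here a "solution"
    # is just its score, drawn at random.
    return rng.random()

def best_of_n(n: int, seed: int = 0) -> float:
    """Generate n independent attempts and keep the highest score seen."""
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(n):
        best = max(best, propose_solution(rng))
    return best

# With a fixed seed, the first 10 attempts are a prefix of the first 1000,
# so more attempts can never lower the best score found.
assert best_of_n(10) <= best_of_n(1000)
```

This is why close final scores are compatible with "floundering": even blind sampling plus max() ratchets upward, which is consistent with the steady (if plateauing) score curve people saw on the livestream.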



