
When the first PC with BASIC launched in the 80s, many people wanted to develop for it.

When the iPhone Appstore launched, many people started to build apps in the ecosystem.

It might be a bit too early to compare RL to those advances in technology, but I personally feel there is huge potential. I might be wrong, though, and I am fine with that.



RL needs a supercomputer, and its code is usually too fragile: making a trivial mistake anywhere (missing a constant multiplication, swapping the order of two consecutive lines of code, etc.) would likely lead to your model never converging, even if you got everything else right.


The hard part of RL for the problems I've encountered in my work is that you need a simulator. Building a reliable and accurate simulator is often an immense undertaking.


Maybe data scientists should team up (more?) with game programmers. They have a ton of experience in building very complex simulations.


Which code is not fragile in that sense? I think that is a rather strange criticism.


You can do RL on a Raspberry Pi. It depends on what problem you are trying to solve, but not all of them require video analysis and billions of parameters.
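To make the small-scale case concrete, here is a minimal sketch of tabular Q-learning on a made-up five-cell corridor. The task, the constants, and every name in it (`step`, `train`, `q_table`) are assumptions for illustration and not from any particular library. It uses only the standard library and a table of ten numbers.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical toy task)
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1
GAMMA = 0.9           # discount factor, chosen arbitrarily
ALPHA = 0.5           # learning rate, chosen arbitrarily


def step(state, action):
    """Toy simulator: move one cell; reward 1.0 only on reaching the goal."""
    nxt = max(state - 1, 0) if action == LEFT else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Purely random exploration; Q-learning is off-policy, so this is allowed.
            action = rng.choice((LEFT, RIGHT))
            nxt, reward = step(state, action)
            # q[GOAL] is never updated and stays at 0, so nothing is bootstrapped past the end.
            target = reward + GAMMA * max(q[nxt])
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


q_table = train()
# Greedy action for every non-terminal cell.
policy = [max((LEFT, RIGHT), key=lambda a: q_table[s][a]) for s in range(GOAL)]
```

The learned greedy policy should move right in every cell. Problems with large or continuous state spaces are a different story, but this kind of thing runs comfortably on very modest hardware.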


Technical point: value functions that are positive constant multiples of each other result in the same behavior.
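A quick sketch of why, assuming the agent acts greedily on its action values. The numbers and helper names are made up for illustration. Multiplying every value by the same positive constant leaves the argmax unchanged, while a negative constant flips it.

```python
# Hypothetical action values for a single state.
q_values = {"left": 1.0, "right": 2.5, "stay": -0.5}


def greedy_action(q):
    """Return the action with the highest value."""
    return max(q, key=q.get)


def scaled(q, c):
    """Multiply every action value by the same constant c."""
    return {action: c * value for action, value in q.items()}


original = greedy_action(q_values)
times_ten = greedy_action(scaled(q_values, 10.0))   # same ordering, same choice
negated = greedy_action(scaled(q_values, -1.0))     # ordering reversed
```

Here `original` and `times_ten` pick the same action, while `negated` picks the action that used to be worst.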


Making a constant multiplication mistake somewhere in the code doesn't imply the new value function would be a constant multiple of the optimal one.
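A minimal sketch of the difference, on a made-up two-action problem (all names and numbers are assumptions for illustration). Scaling the whole backup by a constant keeps the greedy choice, but a stray constant on only the bootstrapped term changes the relative ordering.

```python
GAMMA = 0.9  # discount factor, chosen arbitrarily

# Hypothetical problem: "now" pays 1.0 immediately and ends;
# "wait" pays nothing immediately but leads to a state worth 1.5.
IMMEDIATE = {"now": 1.0, "wait": 0.0}
NEXT_VALUE = {"now": 0.0, "wait": 1.5}


def q_values(bootstrap_scale=1.0, overall_scale=1.0):
    """One-step backup; the two scale arguments model where a constant could sneak in."""
    return {
        a: overall_scale * (IMMEDIATE[a] + bootstrap_scale * GAMMA * NEXT_VALUE[a])
        for a in IMMEDIATE
    }


def greedy(q):
    return max(q, key=q.get)


correct = greedy(q_values())                           # 1.0 vs 1.35
uniformly_scaled = greedy(q_values(overall_scale=0.5)) # 0.5 vs 0.675, same ordering
buggy = greedy(q_values(bootstrap_scale=0.5))          # 1.0 vs 0.675, ordering flipped
```

The uniformly scaled version still prefers "wait", whereas the version with the constant applied only to the bootstrapped term prefers "now".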


RL isn't new though, the foundational results are about 25 years old.


And it feels a bit like it is stalling (at least in continuous control).


In my opinion there's a wide open array of approaches from control that can help with this. Learning for Control is a new conference that looks at this very topic.


No one said "new". You can apply what you said to PCs and iPhones. Mainframes and Palms existed before them.


That's still very analogous to the first PCs. By that point there had been decades of foundational computer work.



