It reminds me of some talk I had the chance to come across, I can’t remember which one. Sleep could be a training phase. Discerning whether inputs come from the real world or from the trained neural network itself would be the reward function. When it’s no longer possible to discerne reality from "fiction", the training is complete. And this could be somehow linked to the ability of training while simultaneously operating the network. I wish I remembered where I heard about it.
Edit: I think this was it: https://www.youtube.com/watch?v=FBkhbqrFyo4