Hacker News

Sounds very similar to https://docs.pytorch.org/docs/stable/distributed.fsdp.fully_... I wonder how much of this could be replicated using only this PyTorch primitive.



Check out Fig. 6 in this paper. It compares the proposed method with PyTorch's native FSDP offload method.


