Thanks for the fantastic explanation!

Would it be more efficient to compute a per-model or per-layer mean, and then store each weight only as its offset from that mean in units of standard deviation, maybe in fp8 or smaller?
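
To make the question concrete, here is a rough sketch of the scheme I have in mind, not a claim about how any real quantizer works. Everything in it is an assumption for illustration: the function names `quantize_layer` and `dequantize_layer` are made up, the 4-sigma clipping range is arbitrary, and since the Python standard library has no fp8 type, a signed 8-bit integer code stands in for it.

```python
import random
import statistics

def quantize_layer(weights, bits=8, clip_sigmas=4.0):
    """Keep one mean and one std per layer, plus a small integer code per weight.

    Hypothetical helper: each weight becomes a z-score, is clipped to
    +/- clip_sigmas, and is mapped linearly onto a signed integer range.
    """
    mean = statistics.fmean(weights)
    std = statistics.pstdev(weights) or 1.0  # constant layer: avoid dividing by zero
    levels = 2 ** (bits - 1) - 1             # 127 for 8 bits
    codes = []
    for w in weights:
        z = (w - mean) / std                        # offset in standard deviations
        z = max(-clip_sigmas, min(clip_sigmas, z))  # outliers saturate at the clip
        codes.append(round(z / clip_sigmas * levels))
    return mean, std, codes

def dequantize_layer(mean, std, codes, bits=8, clip_sigmas=4.0):
    """Rebuild approximate weights from the per-layer stats and the codes."""
    levels = 2 ** (bits - 1) - 1
    return [mean + std * (c / levels * clip_sigmas) for c in codes]

# Toy "layer": 1000 roughly Gaussian weights (synthetic data, not from a model).
random.seed(0)
layer = [random.gauss(0.02, 0.1) for _ in range(1000)]

mean, std, codes = quantize_layer(layer)
restored = dequantize_layer(mean, std, codes)
max_err = max(abs(a - b) for a, b in zip(layer, restored))
print(f"mean={mean:.4f} std={std:.4f} max abs error={max_err:.5f}")
```

The per-layer overhead is just two floats, and for weights inside the clipping range the rounding error is at most half a step, i.e. `0.5 * clip_sigmas / levels * std`. Anything past the clip saturates, so how well this works presumably depends on how heavy the tails of the weight distribution are.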


