
At what quantization? And if it is in fact quantized below fp8, how is the performance impacted on all the various benchmarks?


They claim they don't use quantization.

The reason for their speed is this chip: https://www.cerebras.ai/chip



