Hacker News

It's trained exclusively on an English corpus, AFAIK.

As for whether it's ready for prime time: it's an "alpha" checkpoint from an incomplete training run, so it's not finished cooking.

Also, that is the 7B model. They're cooking 15B, 30B, and 65B right now and planning to start 175B soon.

For comparison, 15B is already larger than GPT-3.5 (which is likely a finetune of Curie 13B), while 175B matches the full-size original GPT-3 at 175B, which the 13B LLaMA already beat on benchmarks. So we can expect all four models larger than 7B to outperform GPT-3 once they finish training (at least in English).


