
>No. I’m suggesting they won’t because IP like the mellanox treasure chest they acquired is ridiculously difficult to develop and Nvidia has aggressively exploited it, along with their other already advanced IP in the space of their -core business-.

For training Llama 3, Facebook set up two clusters, one using fancy InfiniBand and one just using RoCE over Arista switches: https://engineering.fb.com/2024/03/12/data-center-engineerin... . The latter ended up doing fine, suggesting that all that Mellanox stuff isn't necessary for large-scale training (apparently at a large enough scale Ethernet scales better than InfiniBand).


