
I use OpenMPI with no issues across multiple H100 and A100 nodes, over multiple 200G InfiniBand and 100G/200G Ethernet networks, with RDMA (using Mellanox rather than Broadcom cards, but afaik Broadcom supports this just the same). Side note: make sure you compile nvidia_peermem correctly if you want GPUDirect RDMA to work :)
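
For what it's worth, a quick sanity check along these lines can tell you whether the module is at least loaded. This is only a rough sketch: it assumes a Linux host where loaded modules are listed in /proc/modules with the module name as the first field, and the helper names (module_loaded, peermem_loaded) are made up for illustration. A loaded module does not prove GPUDirect RDMA actually works end to end, just that this one prerequisite is in place.

```python
from pathlib import Path


def module_loaded(name: str, modules_text: str) -> bool:
    """Return True if `name` is the first field of any line in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == name:
            return True
    return False


def peermem_loaded(proc_modules: str = "/proc/modules") -> bool:
    """Check the given modules file (assumed Linux layout) for nvidia_peermem."""
    path = Path(proc_modules)
    if not path.exists():
        # Not a Linux host, or procfs is not mounted: report "not loaded".
        return False
    return module_loaded("nvidia_peermem", path.read_text())


if __name__ == "__main__":
    print("nvidia_peermem loaded:", peermem_loaded())
```

If it reports False on a GPU node, the module was probably not built against the running kernel and driver, or was never loaded.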


No issues, except this minor bit of arcane knowledge that is missing from SO. :)



