They must offer distributed storage that can accommodate massive models, though, right? How else would you have multiple GPUs working together to train a single model?