How to utilize clustered gpu for large hf models
Hi,I am using clustered GPU(driver -1GPU and Worker-3GPU), and caching model data into unity catalog but while loading model checkpoint shards its always use driver memory and failed due insufficient memory.How to use complete cluster GPU while loadi...
- 484 Views
- 1 replies
- 0 kudos
Latest Reply
1. Are you using any of the model parallel library, such as FSDP or DeepSpeed? Otherwise, every GPU will load the entire model weights. 2. If yes in 1, Unity Catalog Volumes are exposed on every node at /Volumes/<catalog>/<schema>/<volume>/..., so w...
- 0 kudos