Resolved! Does Databricks support PyTorch distributed training across multiple devices?
Hi, I'm trying to use the Databricks platform for PyTorch distributed training, but I didn't find any info about this. What I expected is to use multiple clusters to run a common job using PyTorch distributed data parallel (DDP) with the code belo...
Latest Reply
Databricks Runtime for Machine Learning (MLR) includes HorovodRunner, which supports distributed training and inference with PyTorch. Here's an example notebook for your reference: PyTorchDistributedDeepLearningTraining - Databricks.
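For anyone who wants a rough idea of what that looks like before opening the notebook, here is a minimal sketch, not the code from the linked notebook. It assumes a cluster running Databricks Runtime ML, where `sparkdl.HorovodRunner` and `horovod.torch` are preinstalled. The function names `train_hvd` and `launch`, the toy linear model, and the random data are all made up for illustration.

```python
def train_hvd(learning_rate=0.01, epochs=1):
    """Training function that HorovodRunner executes on every worker process."""
    # Imports live inside the function because it is pickled and shipped to workers.
    import torch
    import horovod.torch as hvd

    hvd.init()  # set up communication between the worker processes
    use_cuda = torch.cuda.is_available()
    device = torch.device("cuda", hvd.local_rank()) if use_cuda else torch.device("cpu")

    # Toy model (hypothetical); replace with your own network.
    model = torch.nn.Linear(10, 1).to(device)
    # Scaling the learning rate by the worker count is a common Horovod convention.
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate * hvd.size())
    # Wrap the optimizer so gradients are averaged across workers on each step.
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters()
    )
    # Start every worker from the same weights and optimizer state.
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    hvd.broadcast_optimizer_state(optimizer, root_rank=0)

    # Random data seeded per rank stands in for each worker's shard of a real dataset.
    torch.manual_seed(hvd.rank())
    x = torch.randn(64, 10, device=device)
    y = torch.randn(64, 1, device=device)

    loss_value = None
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        loss_value = loss.item()
    # Only rank 0 returns a value to the caller.
    return loss_value if hvd.rank() == 0 else None


def launch(num_processes=2):
    """Run train_hvd on the cluster; call this from a Databricks ML notebook."""
    from sparkdl import HorovodRunner  # assumed available on Databricks Runtime ML

    hr = HorovodRunner(np=num_processes)  # np = number of parallel worker processes
    return hr.run(train_hvd, learning_rate=0.01, epochs=1)
```

In a notebook you would call `launch(num_processes=2)`. Passing a negative `np` runs the processes on the driver node only, which can be handy for a quick test before scaling out.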