Machine Learning

Distributed training of an object detection model with PyTorch and PySpark

Jaeseon
New Contributor II

I'm working on a project that uses PyTorch to build an object detection model on satellite imagery. My immediate goal is to run distributed training for this model with PySpark.

I have found several tutorials and examples for image classification, but I'm having trouble adapting them to object detection.

As far as I understand, I need to load the images and annotation files with PySpark and then convert them into a format that PyTorch can consume for object detection training. I'd appreciate any advice, or pointers to tutorials or examples, that would help me build and refine the model.

During my search, I came across resources on SparkTorch, pyspark.ml.torch.distributor, and Horovod. However, I have not been able to install Horovod successfully. I'd appreciate any guidance on this issue as well.
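The conversion step described in the question can be sketched as follows. This is a minimal illustration, not the poster's pipeline: it assumes the annotations are COCO-style records with a `bbox` of `[x, y, width, height]` and a `category_id`, which the thread does not state. The helper names `coco_to_xyxy` and `build_target` are hypothetical. The output uses the `[x1, y1, x2, y2]` box layout that torchvision's detection models expect.

```python
from typing import Dict, List, Sequence


def coco_to_xyxy(box: Sequence[float]) -> List[float]:
    """Convert an assumed COCO-style [x, y, width, height] box to [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [float(x), float(y), float(x + w), float(y + h)]


def build_target(
    annotations: Sequence[dict],
    category_to_label: Dict[int, int],
) -> Dict[str, list]:
    """Collect boxes and integer labels for one image.

    Boxes with zero or negative width/height are skipped, since
    detection models generally require x1 < x2 and y1 < y2.
    """
    boxes: List[List[float]] = []
    labels: List[int] = []
    for ann in annotations:
        x1, y1, x2, y2 = coco_to_xyxy(ann["bbox"])
        if x2 <= x1 or y2 <= y1:
            continue  # degenerate box, not usable for training
        boxes.append([x1, y1, x2, y2])
        labels.append(category_to_label[ann["category_id"]])
    return {"boxes": boxes, "labels": labels}
```

Inside a PyTorch `Dataset`, the lists would then typically be wrapped as tensors, for example `torch.as_tensor(target["boxes"], dtype=torch.float32).reshape(-1, 4)` for the boxes and `torch.as_tensor(target["labels"], dtype=torch.int64)` for the labels.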

Accepted Solutions

sean_owen
Honored Contributor II

Have you seen https://docs.databricks.com/machine-learning/train-model/distributed-training/spark-pytorch-distribu... ? You don't have to install Horovod; it's already in the runtime. And yes, you can read the images however you want and parse them into pixels, which are just arrays and therefore tensors.
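For readers who want a starting point, here is a rough sketch of launching a training function with `pyspark.ml.torch.distributor.TorchDistributor`. It assumes PySpark 3.4 or later and a cluster with PyTorch installed. The linear model and random tensors are placeholders standing in for a real detection model and data loader, and the names `train_fn` and `launch` are hypothetical. Imports sit inside the functions so the training function can be shipped to the workers.

```python
def train_fn(epochs: int, learning_rate: float) -> float:
    """Runs once per worker process started by TorchDistributor."""
    import os

    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    use_cuda = torch.cuda.is_available()
    # Rank and world size are read from the environment set by the launcher.
    dist.init_process_group(backend="nccl" if use_cuda else "gloo")
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    device = torch.device(f"cuda:{local_rank}" if use_cuda else "cpu")
    if use_cuda:
        torch.cuda.set_device(device)

    # Placeholder model: swap in the actual object detection model here.
    model = torch.nn.Linear(16, 4).to(device)
    ddp_model = DDP(model, device_ids=[local_rank] if use_cuda else None)
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=learning_rate)
    loss_fn = torch.nn.MSELoss()

    loss = torch.tensor(0.0)
    for _ in range(epochs):
        # Placeholder batch: random data instead of images and targets.
        inputs = torch.randn(8, 16, device=device)
        targets = torch.randn(8, 4, device=device)
        optimizer.zero_grad()
        loss = loss_fn(ddp_model(inputs), targets)
        loss.backward()  # DDP averages gradients across processes here
        optimizer.step()

    dist.destroy_process_group()
    return loss.item()


def launch(num_processes: int = 2, use_gpu: bool = False) -> float:
    """Start train_fn on the cluster; requires an active Spark session."""
    from pyspark.ml.torch.distributor import TorchDistributor

    distributor = TorchDistributor(
        num_processes=num_processes, local_mode=False, use_gpu=use_gpu
    )
    # Extra positional arguments are forwarded to train_fn.
    return distributor.run(train_fn, 2, 1e-3)
```

Calling `launch()` from a notebook attached to the cluster would start the processes. For a first test on a single machine, `local_mode=True` may be a simpler setting.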

3 REPLIES

Jaeseon
New Contributor II

I ran into a problem on my school's Linux server: it uses an older version of PySpark, so I can't use the module from the link you provided. As an alternative, I tried to install and import Ray in a Jupyter Notebook on the server. The package ray==2.4.0 installed successfully, but importing it fails with the message "no module named ray.__raylet".

On my local machine, the same version of Ray installs and imports without any issues.

I would appreciate guidance on how to resolve this problem in the context of my school's Linux server.
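One way to narrow down an import failure like this is to compare the Python environment on the server with the one on the local machine. The sketch below uses only the standard library and does not import Ray itself. It assumes the missing piece is Ray's compiled `_raylet` extension, which the thread does not confirm, and the function name `inspect_package` is hypothetical.

```python
import importlib.util
import platform
import sys
from pathlib import Path


def inspect_package(package: str, extension_stem: str) -> dict:
    """Report where `package` is installed and which files in its
    directory start with `extension_stem` (e.g. a compiled extension)."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
        "machine": platform.machine(),
        "package_dir": None,
        "extension_files": [],
    }
    # find_spec on a top-level name locates the package without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return report
    package_dir = Path(next(iter(spec.submodule_search_locations)))
    report["package_dir"] = str(package_dir)
    report["extension_files"] = sorted(
        p.name for p in package_dir.glob(f"{extension_stem}*")
    )
    return report
```

Running `inspect_package("ray", "_raylet")` in the notebook on both machines and comparing the output may help. If `package_dir` points somewhere unexpected, or `extension_files` is empty on the server, the notebook kernel may be using a different interpreter than the one `pip` installed into, or the installed wheel may not match the server's Python version or platform.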

Anonymous
Not applicable

Hi @Jaeseon Song

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
