Databricks Community

Jaeseon · ‎06-02-2023

I'm currently immersed in a project where I'm leveraging PyTorch to develop an object detection model using satellite imagery. My immediate objective is to perform distributed training on this model using PySpark.

While I have found several tutorials and examples on image classification, I'm having trouble translating these resources to suit my needs.

Specifically, as far as I understand, I need to load the images and annotation files with PySpark and then convert them into a format compatible with PyTorch for building the object detection model. I'd welcome any advice or pointers to tutorials or examples that could help me build and refine my model.

During my search, I have stumbled upon some resources related to using SparkTorch or pyspark.ml.torch.distributor, and horovod. However, I've been encountering difficulties in successfully installing horovod. I appreciate any guidance on this issue as well.

sean_owen · ‎06-02-2023

Have you seen https://docs.databricks.com/machine-learning/train-model/distributed-training/spark-pytorch-distribu... ? You don't have to install Horovod; it's already in the runtime. And yes, you can read the images however you want and parse them into pixels, which are just arrays and therefore tensors.
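
To make the "pixels are just arrays, thus tensors" point concrete, here is a minimal sketch, not a definitive implementation. It assumes each record has already been decoded into raw RGB bytes in height x width x 3 order, and that the annotations are COCO-style [x, y, width, height] boxes; neither detail comes from the original question. The names xywh_to_xyxy, to_detection_sample, train_fn, and launch are hypothetical. TorchDistributor is the class behind pyspark.ml.torch.distributor and needs PySpark 3.4 or later.

```python
from typing import List, Sequence


def xywh_to_xyxy(box: Sequence[float]) -> List[float]:
    """Convert an assumed COCO-style [x, y, width, height] box to [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def to_detection_sample(pixels: bytes, height: int, width: int,
                        boxes_xywh: Sequence[Sequence[float]],
                        labels: Sequence[int]):
    """Turn one decoded record into an (image, target) pair in torchvision's detection format."""
    import torch  # lazy import so xywh_to_xyxy works without PyTorch installed

    # Assumption: `pixels` holds height * width * 3 uint8 values in HWC, RGB order.
    image = torch.frombuffer(bytearray(pixels), dtype=torch.uint8)
    # Reorder to channels-first and scale to [0, 1] floats.
    image = image.reshape(height, width, 3).permute(2, 0, 1).float() / 255.0
    target = {
        "boxes": torch.tensor([xywh_to_xyxy(b) for b in boxes_xywh], dtype=torch.float32),
        "labels": torch.tensor(list(labels), dtype=torch.int64),
    }
    return image, target


def train_fn(num_epochs: int) -> None:
    """Hypothetical per-process training entry point; the model and loop are left out."""
    # Build a Dataset that yields to_detection_sample(...) pairs, wrap it in a
    # DataLoader, and train a detection model here.
    pass


def launch(num_processes: int = 2):
    """Run train_fn across the cluster; requires PySpark 3.4+ for TorchDistributor."""
    from pyspark.ml.torch.distributor import TorchDistributor

    distributor = TorchDistributor(num_processes=num_processes, local_mode=False, use_gpu=True)
    return distributor.run(train_fn, 1)  # extra positional args are passed to train_fn
```

The box conversion is there because torchvision's detection models expect corner coordinates rather than width and height. If your annotations are already in [x1, y1, x2, y2] form, skip that step.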

Jaeseon · ‎06-05-2023

I ran into a problem on my school's Linux server: it runs an older version of PySpark, so I can't use the module from the link you provided. As an alternative, I tried to install and import Ray in a Jupyter Notebook on the server. The package ray==2.4.0 installed successfully, but importing it failed with the message "no module named ray.__raylet".

On my local machine, I installed the same version of Ray, and it was successfully installed and imported without any issues.

I would appreciate guidance on how to resolve this problem in the context of my school's Linux server.

Anonymous · ‎06-14-2023

Hi @Jaeseon Song

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!

Distributed training on building object detection model on PyTorch and PySpark.
