Machine Learning

by Jaeseon • New Contributor II

06-02-2023 12:35:16 PM

4359 Views
3 replies
3 kudos

Resolved! Distributed training on building object detection model on PyTorch and PySpark.

I'm currently immersed in a project where I'm leveraging PyTorch to develop an object detection model using satellite imagery. My immediate objective is to perform distributed training on this model using PySpark. While I have found several tutorials...

Latest Reply

Anonymous
Not applicable

06-14-2023 12:11:14 AM

3 kudos

Hi @Jaeseon Song, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

2 More Replies

by ptawil • New Contributor III

07-07-2022 8:49:48 AM

3374 Views
2 replies
4 kudos

Runtime error using MLFlow and Spark on databricks

Here is some model I created:

class SomeModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, input):
        # do fancy ML stuff
        # log results
        pandas_df = pd.DataFrame(...insert predictions here...)
        spark_df = spark...

Latest Reply

Nikhil3107
New Contributor III

06-07-2023 8:08:01 AM

4 kudos

Any updates on this? I am running into the same issue. @Patrick Tawil, were you able to solve this problem? If so, do you mind sharing?

1 More Replies

by ryojikn • New Contributor III

01-15-2023 8:26:07 PM

2016 Views
1 reply
1 kudos

Error on pandas udf usage in databricks, sc.broadcasting random forest loaded from Kedro MLFlow Logger DataSet, cannot pickle '_thread.RLock' object

I'm trying to broadcast a random forest (sklearn 1.2.0) recently loaded from MLflow, and using a pandas UDF to predict with the model. However, the same code works perfectly on Spark 2.4 + our OnPrem cluster. I thought it was due to Spark 2.4 to 3 changes, an...

Latest Reply

ryojikn
New Contributor III

01-30-2023 5:03:31 AM

1 kudos

Anyone?
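
For readers hitting the same "cannot pickle '_thread.RLock' object" message: Spark serializes a broadcast value with pickle, so any attribute holding a lock makes the whole object unpicklable. The cause in this thread is not confirmed. The sketch below only shows one way to find which attribute of an object blocks pickling. `ModelWrapper` and `find_unpicklable_attrs` are hypothetical names for illustration, not part of sklearn, MLflow, or Kedro.

```python
import pickle
import threading


class ModelWrapper:
    """Hypothetical stand-in for a loaded model object that carries a lock."""

    def __init__(self):
        self.coefficients = [0.1, 0.2]   # plain data, picklable
        self._lock = threading.RLock()   # locks cannot be pickled


def find_unpicklable_attrs(obj):
    """Return {attribute name: error message} for attributes pickle rejects."""
    problems = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle raises TypeError for lock objects
            problems[name] = str(exc)
    return problems


wrapper = ModelWrapper()
problems = find_unpicklable_attrs(wrapper)
print(problems)
```

If the offending attribute turns out to be a logger, client, or session handle rather than the estimator itself, broadcasting only the underlying estimator may be enough, though that depends on how the model was loaded.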

by weldermartins • Honored Contributor

10-08-2022 6:20:27 AM

21487 Views
17 replies
13 kudos

Resolved! Created nested struct schema SPARK - Schema Jira

Hello guys, I'm using the Jira API to return "ISSUES". But to be able to use PySpark I need to create the DataFrame by passing in the schema, and I am not able to create the schema based on the model below. Would you have any ideas?

root
 |-- expand: string ...

Latest Reply

-werners-
Esteemed Contributor III

10-11-2022 1:21:32 AM

13 kudos

If columns are missing, that particular data is not present in the JSON. I am not aware of Spark skipping columns when reading JSON with inferSchema. There is an option, dropFieldIfAllNull, but it is False by default. That makes me think: you might ...

16 More Replies

by jnjns • New Contributor II

07-27-2022 6:23:08 AM

1253 Views
0 replies
3 kudos

Java Error for installation rasterframes

Hi all, I have followed the steps in this notebook to install rasterframes on my Databricks cluster. Eventually I am able to import the following:

from pyrasterframes import rf_ipython
from pyrasterframes.utils import create_rf_spark_session
from pyspar...

by User16826994223 • Honored Contributor III

06-09-2021 3:03:28 AM

2172 Views
1 reply
0 kudos

Multiple Where condition vs AND && in PySpark

.where((col('state')==state) & (col('month')>startmonth))

I can do the where conditions both ways. I think the one below adds readability. Is there any other difference, and which is the best?

.where(col('state')==state).where(col('month')>startmonth)

Latest Reply

User16826994223
Honored Contributor III

06-09-2021 3:04:39 AM

0 kudos

You can use explain to see what physical and logical plans are created. This is the best way to see the difference, but as mentioned in the question, it should give the same physical plan.

Databricks Community
