Machine Learning

Forum Posts

Jaeseon
by New Contributor II
  • 1486 Views
  • 3 replies
  • 3 kudos

Resolved! Distributed training on building object detection model on PyTorch and PySpark.

I'm currently immersed in a project where I'm leveraging PyTorch to develop an object detection model using satellite imagery. My immediate objective is to perform distributed training on this model using PySpark. While I have found several tutorials...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Jaeseon Song, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

2 More Replies
ptawil
by New Contributor III
  • 1924 Views
  • 2 replies
  • 4 kudos

Runtime error using MLFlow and Spark on databricks

Here is some model I created:

class SomeModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, input):
        # do fancy ML stuff
        # log results
        pandas_df = pd.DataFrame(...insert predictions here...)
        spark_df = spark...

Latest Reply
Nikhil3107
New Contributor III
  • 4 kudos

Any updates on this? I am running into the same issue. @Patrick Tawil, were you able to solve this problem? If so, do you mind sharing?

1 More Replies
ryojikn
by New Contributor III
  • 972 Views
  • 1 reply
  • 1 kudos

Error on pandas udf usage in databricks, sc.broadcasting random forest loaded from Kedro MLFlow Logger DataSet, cannot pickle '_thread.RLock' object

I'm trying to broadcast a random forest (sklearn 1.2.0) recently loaded from MLflow and use a Pandas UDF to run predictions with it. However, the same code works perfectly on Spark 2.4 + our OnPrem cluster. I thought it was due to Spark 2.4 to 3 changes, an...

Latest Reply
ryojikn
New Contributor III
  • 1 kudos

Anyone?
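
The thread has no resolution, so the following is only a sketch of how this class of error can arise, not a diagnosis of the poster's code. It assumes the broadcast object holds a `threading.RLock` somewhere in its attributes; `ModelWithLock` and `PicklableModel` are hypothetical stand-ins, not sklearn, MLflow, or Kedro classes. Broadcasting pickles the object, and pickling fails on the lock. One common workaround is to drop the lock when pickling and rebuild it on load:

```python
import pickle
import threading


class ModelWithLock:
    """Hypothetical stand-in for a loaded model object that holds a lock."""

    def __init__(self, weights):
        self.weights = weights
        self._lock = threading.RLock()  # locks cannot be pickled

    def predict(self, xs):
        with self._lock:
            return [x * w for x, w in zip(xs, self.weights)]


class PicklableModel(ModelWithLock):
    """Same model, but the lock is dropped on pickling and rebuilt on load."""

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._lock = threading.RLock()


def try_pickle(obj):
    """Return the pickled bytes, or the error message if pickling fails."""
    try:
        return pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)


# Pickling the plain object fails; the picklable variant round-trips.
failure = try_pickle(ModelWithLock([1.0, 2.0]))
restored = pickle.loads(try_pickle(PicklableModel([1.0, 2.0])))
print(failure)
print(restored.predict([3.0, 4.0]))
```

If the lock lives inside a wrapper rather than the estimator itself, another option may be to broadcast only the underlying picklable estimator, or to load the model lazily inside the UDF.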

weldermartins
by Honored Contributor
  • 6366 Views
  • 17 replies
  • 13 kudos

Resolved! Created nested struct schema SPARK - Schema Jira

Hello guys, I'm using the Jira API to return "ISSUES". To be able to use PySpark, I need to create the DataFrame by passing in a schema, but I am not able to create the schema based on the model below. Would you have any ideas?

root
 |-- expand: string ...

Latest Reply
-werners-
Esteemed Contributor III
  • 13 kudos

If columns are missing, that particular data is not present in the JSON. I am not aware of Spark skipping columns when reading JSON with inferSchema. There is an option dropFieldIfAllNull, but that is False by default. That makes me think: you might ...

16 More Replies
jnjns
by New Contributor II
  • 642 Views
  • 0 replies
  • 3 kudos

Java Error for installation rasterframes

Hi all, I have followed the steps in this notebook to install rasterframes on my Databricks cluster. Eventually I am able to import the following:

from pyrasterframes import rf_ipython
from pyrasterframes.utils import create_rf_spark_session
from pyspar...

User16826994223
by Honored Contributor III
  • 1275 Views
  • 1 reply
  • 0 kudos

Multiple where conditions vs AND (&) in PySpark

.where((col('state')==state) & (col('month')>startmonth))

I can write the where conditions both ways. I think the one below adds readability. Is there any other difference, and which is best?

.where(col('state')==state).where(col('month')>startmonth)

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

You can use explain to see what physical and logical plans are created. This is the best way to see the difference, but as mentioned in the question, it should give the same physical plan.
