Runtime error using MLflow and Spark on Databricks
07-07-2022 08:49 AM
Here is some model I created:
import mlflow
import pandas as pd

class SomeModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, input):
        # do fancy ML stuff
        # log results
        pandas_df = pd.DataFrame(...insert predictions here...)
        spark_df = spark.createDataFrame(pandas_df)
        spark_df.write.saveAsTable('tablename', mode='append')
Later in my code, I try to log the model like this:
with mlflow.start_run(run_name="SomeModel_run"):
    model = SomeModel()
    mlflow.pyfunc.log_model("somemodel", python_model=model)
Unfortunately, it gives me this error message:
RuntimeError: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
The error is caused by this line:
mlflow.pyfunc.log_model("somemodel", python_model=model)
If I comment it out, my model makes its predictions and logs the results to my table.
Alternatively, if I remove the lines in my predict function where I use Spark to create a DataFrame and save the table, I am able to log my model.
How do I go about resolving this issue? I need my model to not only write to the table but also be logged.
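A minimal sketch of one possible workaround, assuming the failure happens because log_model pickles the model and, with it, the notebook's global spark session (which holds the SparkContext): keep Spark objects out of the model's state and resolve the session lazily inside predict, for example via SparkSession.builder.getOrCreate(). The placeholder predictions and table name below are illustrative only, not the original notebook's code.

import mlflow
import pandas as pd
from pyspark.sql import SparkSession

class SomeModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        # do fancy ML stuff (placeholder predictions, illustrative only)
        pandas_df = pd.DataFrame({"prediction": [0.0]})

        # Resolve the session at call time instead of referencing the
        # notebook's global `spark`, so no SparkContext is captured when
        # the model object is pickled by log_model.
        spark = SparkSession.builder.getOrCreate()
        spark.createDataFrame(pandas_df).write.saveAsTable('tablename', mode='append')
        return pandas_df

with mlflow.start_run(run_name="SomeModel_run"):
    mlflow.pyfunc.log_model("somemodel", python_model=SomeModel())

Alternatively, predict could simply return the pandas DataFrame and the caller could create the Spark DataFrame and write the table outside the model, which matches the observation above that removing the Spark calls lets the model log cleanly.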
- Labels:
  - MLflow
  - MLflow Model
  - PySpark
  - Python
  - Spark
12-17-2022 11:26 PM
This is something new; we will have to explore it. Do you have any docs that you are following here?
06-07-2023 08:08 AM
Any updates on this? I am running into the same issue.
@Patrick Tawil were you able to solve this problem? If so, do you mind sharing?

