Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

dbfs not found

byrnesy5
New Contributor II

Hi, 

I've saved a custom pyfunc model and I'm trying to load it inside a pandas_udf. It works on small samples, or if I repartition everything to 1 partition, but when I run it on a larger sample distributed across my cluster, it fails consistently with an error saying the model cannot be found in DBFS.

Any ideas?

Thanks,

Andrew

 

3 REPLIES

byrnesy5
New Contributor II

Here you can see the progress: 15 tasks have succeeded and 110 have failed (eventually the job fails completely). It seems some workers can see the file and others cannot. I'm not sure why some tasks would fail and others succeed in this case.

[Screenshot: byrnesy5_0-1734453715975.png]
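
One way to test the guess that only some workers can see the file is to probe the path from many tasks and group the answers by host. The following is a minimal sketch, not a confirmed diagnosis: `probe_path` and `summarize` are hypothetical helper names, and the commented Spark usage assumes a live SparkContext `sc` and a placeholder model path.

```python
import os
import socket


def probe_path(path):
    """Report whether `path` is visible from the machine running this call."""
    return {
        "host": socket.gethostname(),
        "path": path,
        "exists": os.path.exists(path),
    }


def summarize(results):
    """Split probe results into hosts that saw the path and hosts that did not."""
    visible = sorted({r["host"] for r in results if r["exists"]})
    missing = sorted({r["host"] for r in results if not r["exists"]})
    return {"visible": visible, "missing": missing}


# Hypothetical use on a cluster (assumes a live SparkContext `sc`;
# `model_path` is a placeholder for wherever the model was saved):
#   results = (sc.parallelize(range(200), 200)
#                .map(lambda _: probe_path(model_path))
#                .collect())
#   print(summarize(results))
```

If the same host shows up under both `visible` and `missing`, the problem is probably intermittent rather than tied to specific workers.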

 

Alberto_Umana
Databricks Employee

This problem often occurs because the model artifacts are not available on all executors, which is a common issue in a distributed environment.

Can you try calling dbutils.fs.refreshMounts() in your code?

If the model is small enough, you can also broadcast it to all executors using sc.broadcast.
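
A loaded model object is not always picklable, so one variation on the broadcast idea is to broadcast the raw artifact files as bytes and rebuild the directory on each executor. The following is a rough sketch under stated assumptions, not a confirmed fix: `pack_dir` and `unpack_to_tmp` are hypothetical helpers, and the commented usage assumes a live SparkContext `sc`, an installed `mlflow`, and a driver-local copy of the model directory in `local_model_dir`.

```python
import io
import os
import tempfile
import zipfile


def pack_dir(model_dir):
    """Return the files under `model_dir` as zip-compressed bytes.

    Plain bytes are picklable, so they can be broadcast.
    Empty subdirectories are not preserved.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(model_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the model directory root.
                zf.write(full, os.path.relpath(full, model_dir))
    return buf.getvalue()


def unpack_to_tmp(blob):
    """Extract bytes produced by pack_dir into a new temp dir and return its path."""
    target = tempfile.mkdtemp(prefix="model_")
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        zf.extractall(target)
    return target


# Hypothetical use (assumptions: `sc`, `mlflow`, `local_model_dir`):
#   blob = sc.broadcast(pack_dir(local_model_dir))
#   # inside the pandas UDF, on the executor:
#   model = mlflow.pyfunc.load_model(unpack_to_tmp(blob.value))
```

Unpacking on every batch would be wasteful, so in practice the extracted path or the loaded model would likely be cached once per worker process.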

Thanks for getting back to me. Didn't have any luck though. 

I tried dbutils.fs.refreshMounts() and I'm still getting the same errors.

I tried broadcasting the model, but it can't be pickled, so the broadcast fails.

Any other ideas?