Serving a custom transformer class via a pyfunc wrapper for a pyspark recommendation model

New Contributor

I am trying to serve an ALS PySpark model with a custom transformer (for generating user-specific recommendations) via a pyfunc wrapper. Although I can successfully score the logged model, the serving endpoint throws the following error:

URI '/model/artifacts/./sparkml' does not point to the current DFS.
File '/model/artifacts/./sparkml' not found on DFS. Will attempt to upload the file.
An error occurred while loading the model. 'NoneType' object has no attribute 'jvm'.
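
The last log line has the wording Python uses when an attribute is read from None. A small illustration, assuming (the post does not confirm this) that the object expected to hold the Spark JVM gateway was never initialised in the serving container. The names `read_jvm` and `gateway` are hypothetical and only serve the demonstration:

```python
def read_jvm(gateway):
    """Return gateway.jvm, or the error text if the attribute lookup fails."""
    try:
        return gateway.jvm
    except AttributeError as exc:
        return str(exc)


# With no gateway object at all, the lookup fails on None.
print(read_jvm(None))
```

If that reading is right, the message points at a missing Spark session rather than at the ALS model or the custom transformer itself.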

Following is a brief summary of the code snippets used for model training, logging, and the setup for model serving:

  • Model training and logging to MLflow Model Registry with additional requirements and code paths specified.
  • Custom PySpark ML Transformer implementation for generating user-specific recommendations.
  • Python pyfunc wrapper for serving the model.
  • Attempt to serve the model using Databricks' MLflow serving feature, which leads to the error above.

Community Manager

Hi @Nishat,

  • Confirm that the model artefacts exist and are stored in a location the serving endpoint can read. The log reports that '/model/artifacts/./sparkml' was not found on the DFS.
  • Verify the paths and file locations in your code, and ensure that the pyfunc wrapper points to that same artefact location.
  • Make sure that the serving environment has the necessary PySpark and JVM components, and check for missing dependencies or incorrect configuration. The 'NoneType' object has no attribute 'jvm' message may indicate that no Spark session was available when the model was loaded.
  • Consider registering your custom transformer class properly, for example by making sure it is importable in the serving environment, to avoid serialization issues during model loading.
  • If you’re using MLflow, ensure that the model registry setup is consistent across training, logging, and serving.
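
A minimal sketch of how the path and JVM checks above might look inside the wrapper, assuming the error means that no Spark session exists in the serving container when the model is loaded. The class name `ALSRecommenderWrapper`, the artifact key `"sparkml"`, and both helper functions are hypothetical. In a real deployment the class would subclass `mlflow.pyfunc.PythonModel`; the base class is left out here so the snippet does not depend on MLflow, and the Spark imports are deferred until the helpers are called.

```python
import os

ARTIFACT_KEY = "sparkml"  # hypothetical; use the key given when the model was logged


def start_local_spark():
    """Start (or reuse) a single-core local Spark session in the container."""
    from pyspark.sql import SparkSession  # lazy import: needs PySpark and a JVM

    return SparkSession.builder.master("local[1]").appName("als-serving").getOrCreate()


def load_pipeline(path):
    """Load a saved Spark ML pipeline; custom stages must be importable here."""
    from pyspark.ml import PipelineModel

    return PipelineModel.load(path)


class ALSRecommenderWrapper:
    """Mirrors the load_context/predict interface of mlflow.pyfunc.PythonModel."""

    def __init__(self, start_session=start_local_spark, load_model=load_pipeline):
        self._start_session = start_session
        self._load_model = load_model
        self.spark = None
        self.model = None

    def load_context(self, context):
        path = context.artifacts[ARTIFACT_KEY]
        # Fail early with a clear message if the artefact is missing.
        if not os.path.exists(path):
            raise FileNotFoundError(f"Model artefact not found: {path}")
        # Start Spark before loading: Spark ML stages need a running session.
        self.spark = self._start_session()
        self.model = self._load_model(path)

    def predict(self, context, model_input):
        # model_input is assumed to be a pandas DataFrame of user IDs.
        spark_df = self.spark.createDataFrame(model_input)
        return self.model.transform(spark_df).toPandas()
```

Passing the session and loader functions in as arguments keeps the loading order easy to test with stand-ins, without Spark installed. Whether a local Spark session is acceptable on a given serving endpoint depends on the container having Java available, which the post does not state.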

If you need further assistance, feel free to ask! 🚀