
Custom transformers with MLflow

NaeemS
New Contributor III

Hi Everyone,

I have created a Spark pipeline that contains a custom Transformer stage, and I am using the Feature Store to log my model. The issue is that the custom Transformer stage is not serialized properly, so it is not logged along with the rest of the pipeline.

As a workaround, I logged the pipeline with MLflow and logged the custom transformer's code as a separate artifact. In the inference environment I first download that artifact, save it to a temporary path, and add that path to sys.path, which makes the custom code available so the model can run inference successfully. However, I cannot do this when I log the model through the Feature Store. And in any case, logging the custom code as a separate artifact and loading it before inference is unreliable and defeats the whole idea of an end-to-end pipeline.
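For concreteness, the stage is along these lines: a Transformer that mixes in DefaultParamsReadable and DefaultParamsWritable so that standard pipeline save/load can persist its params (a minimal illustrative sketch, not my actual code; the class and column names are made up):

```python
from pyspark import keyword_only
from pyspark.ml import Transformer
from pyspark.ml.param.shared import HasInputCol, HasOutputCol
from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable
from pyspark.sql import functions as F


class DoubleColumn(Transformer, HasInputCol, HasOutputCol,
                   DefaultParamsReadable, DefaultParamsWritable):
    """Toy custom stage: writes inputCol * 2 into outputCol."""

    @keyword_only
    def __init__(self, inputCol=None, outputCol=None):
        super().__init__()
        # Standard PySpark pattern: keyword_only stores the passed
        # kwargs on self._input_kwargs for us to set as params.
        self._set(**self._input_kwargs)

    def _transform(self, dataset):
        return dataset.withColumn(
            self.getOutputCol(), F.col(self.getInputCol()) * 2
        )
```

Even with these mixins, the class itself still has to be importable wherever the pipeline is loaded, and that is exactly the part that breaks for me in the inference environment.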

Any help in this regard would be highly appreciated.

Thanks in advance!
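For reference, the inference-side workaround I described looks roughly like this (an illustrative sketch; the run ID, artifact name, and module name are made up):

```python
import sys
import tempfile

import mlflow
import mlflow.spark

# Illustrative values; the real run ID and artifact path differ.
run_id = "<run-id>"
tmp_dir = tempfile.mkdtemp()

# Download the module that defines the custom Transformer and
# make it importable.
code_dir = mlflow.artifacts.download_artifacts(
    run_id=run_id, artifact_path="custom_code", dst_path=tmp_dir
)
sys.path.append(code_dir)

# Only now can the logged pipeline be loaded, since deserializing the
# custom stage needs its class definition on sys.path.
model = mlflow.spark.load_model(f"runs:/{run_id}/model")
```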

 

2 REPLIES

NaeemS
New Contributor III

Hi @Retired_mod, could you please guide me on the additional steps I need to handle serialization of custom transformers, so that I can use them in my model pipeline via the Feature Store?

Thanks!
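For what it's worth, with plain MLflow the module that defines the custom transformer can be bundled with the model itself via log_model's code_paths argument, which MLflow adds to sys.path at load time. A minimal sketch (file and variable names are illustrative):

```python
import mlflow
import mlflow.spark

# "my_transformers.py" (illustrative name) defines the custom
# Transformer; pipeline_model is the fitted PipelineModel that
# contains that stage.
with mlflow.start_run():
    mlflow.spark.log_model(
        pipeline_model,
        artifact_path="model",
        # Copied into the model artifact and put on sys.path on load.
        code_paths=["my_transformers.py"],
    )
```

What I still don't see is how to achieve the same thing when the model is logged through the Feature Store.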

WarrenO
New Contributor III

Hi @NaeemS,

Did you ever find a solution to this problem? I have now run into it myself. When I save the pipeline using MLflow's log_model, I can load the model fine, but when I log it with the Databricks Feature Engineering package, it throws an error stating that the custom class does not exist.
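For concreteness, the shape of what works and what fails for me is roughly this. The last call is an untested sketch resting on the assumption that fe.log_model forwards extra keyword arguments to the underlying MLflow flavor, which would let code_paths bundle the custom module (file and model names are made up):

```python
import mlflow.spark
from databricks.feature_engineering import FeatureEngineeringClient

fe = FeatureEngineeringClient()

# Works: plain MLflow logging; the model loads fine afterwards.
mlflow.spark.log_model(pipeline_model, artifact_path="model")

# Fails on load with "custom class does not exist":
fe.log_model(
    model=pipeline_model,
    artifact_path="model_fe",
    flavor=mlflow.spark,
    training_set=training_set,  # TrainingSet from fe.create_training_set
    registered_model_name="my_pipeline_model",
)

# Assumption: extra kwargs reach the flavor's save_model, so this may
# bundle the custom module the way mlflow.spark's code_paths does.
fe.log_model(
    model=pipeline_model,
    artifact_path="model_fe",
    flavor=mlflow.spark,
    training_set=training_set,
    registered_model_name="my_pipeline_model",
    code_paths=["my_transformers.py"],  # illustrative file name
)
```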

 

Regards