Custom transformers with MLflow
06-06-2024 03:27 PM
Hi Everyone,
I have created a Spark pipeline that includes a custom transformer as one of its stages, and I am using the Feature Store to log my model. The problem is that the custom transformer stage is not serialized properly and is not logged along with the rest of the pipeline. As a workaround, I logged the pipeline with MLflow and logged the custom transformer's code as a separate artifact. In the inference environment I first download that artifact to a temporary path and add the path to sys.path, which makes the custom code available so the model can run inference successfully (roughly as in the sketch below). However, I cannot do this when I log the model through the Feature Store, and logging the custom code as an artifact and loading it before inference is not reliable and defeats the purpose of an end-to-end pipeline.
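A minimal sketch of that workaround; the file name my_transformers.py, the fitted pipeline_model, and the artifact paths are placeholders for my actual objects, and only the MLflow calls themselves are the real API:

```python
import sys
import tempfile

import mlflow
import mlflow.spark

# --- training side: log the fitted pipeline plus the custom-transformer code ---
with mlflow.start_run() as run:
    mlflow.spark.log_model(pipeline_model, artifact_path="model")
    # my_transformers.py is the module defining the custom transformer class
    mlflow.log_artifact("my_transformers.py", artifact_path="code")

# --- inference side: make the custom code importable before loading the model ---
tmp_dir = tempfile.mkdtemp()
code_dir = mlflow.artifacts.download_artifacts(
    artifact_uri=f"runs:/{run.info.run_id}/code", dst_path=tmp_dir
)
sys.path.append(code_dir)  # the serialized stage can now resolve its class

loaded_model = mlflow.spark.load_model(f"runs:/{run.info.run_id}/model")
```

(mlflow.spark.log_model also exposes a code_paths argument for bundling local modules with the model, but that still does not cover the Feature Store logging path.)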
Any help in this regard would be highly appreciated.
Thanks in advance!
Labels: Spark
06-07-2024 06:37 AM
Hi @Retired_mod, could you please guide me on the additional steps I'll need to handle serialization of custom transformers, so that I can use them in my model pipeline via the Feature Store?
Thanks!
03-06-2025 10:45 AM
Hi @NaeemS,
Did you ever find a solution to this problem? I've now run into it myself. When I save the pipeline using MLflow's log_model, I am able to load the model fine, but when I log it with the Databricks Feature Engineering package, it throws an error stating that the custom class does not exist. A rough sketch of the two logging paths is below.
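For reference, this is roughly what I am doing; CustomFeaturizer, pipeline_model, training_set, and the registered model name are placeholders standing in for my actual objects:

```python
import mlflow
import mlflow.spark
from pyspark.ml import Transformer
from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable
from databricks.feature_engineering import FeatureEngineeringClient


class CustomFeaturizer(Transformer, DefaultParamsReadable, DefaultParamsWritable):
    """Custom pipeline stage; the mixins let Spark ML save/load its params."""

    def _transform(self, dataset):
        return dataset  # real feature logic omitted


# 1) Plain MLflow: logging and reloading the pipeline works fine.
with mlflow.start_run() as run:
    mlflow.spark.log_model(pipeline_model, artifact_path="model")
reloaded = mlflow.spark.load_model(f"runs:/{run.info.run_id}/model")

# 2) Feature Engineering client: this is the path that errors out,
#    complaining that the custom class does not exist.
fe = FeatureEngineeringClient()
fe.log_model(
    model=pipeline_model,
    artifact_path="fs_model",
    flavor=mlflow.spark,
    training_set=training_set,
    registered_model_name="my_pipeline_model",
)
```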
Regards

