Handling Null Values in Feature Stores

NaeemS
New Contributor III

Hi, I am using multiple feature stores in my workflow through feature lookups. My logged pipeline has several stages: Assembler, Standard Scaler, Indexer, and then the Model. However, I am facing an issue during inference with the `score_batch` function.

If an identifier does not have all of its pre-computed values in the feature stores, the join performed by the feature lookups assigns a null value, and that null is then passed directly to the model in `score_batch`. Is there any way to handle this? So far I have tried the following:

  • Defining a custom transformer as the first stage of my pipeline to handle such columns. To use it properly, I would have to log this additional code along with my model. MLflow supports this through the `code_path` parameter, but the feature store `log_model` method does not provide this parameter.
  • Feature store provides `FeatureFunction` to compute on-demand features, but it is meant for adding extra columns to the resulting dataframe. Can we leverage it to handle null values in some columns by defining logic in the functions that replaces the nulls with default values?
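
As a rough sketch of the second idea, the null-handling logic that such a function might carry could look like the following. It is plain Python only: the column names and default values are hypothetical, and registering the logic as a function that `FeatureFunction` can reference is assumed rather than shown.

```python
import math
from typing import Optional

# Hypothetical fallback values per feature column; names are illustrative only.
DEFAULTS = {"avg_spend": 0.0, "visit_count": 0.0}


def fill_missing(value: Optional[float], default: float) -> float:
    """Return `default` when the looked-up value is missing (None or NaN)."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return default
    return float(value)


def impute_row(row: dict) -> dict:
    """Apply the per-column defaults to one joined row; other keys pass through."""
    return {
        key: (fill_missing(val, DEFAULTS[key]) if key in DEFAULTS else val)
        for key, val in row.items()
    }


# A row as it might look after a lookup found no value for one feature.
joined = {"customer_id": 42, "avg_spend": None, "visit_count": 3}
print(impute_row(joined))
```

Since an on-demand feature is written to an additional output column, the pipeline stages would presumably need to read that filled column instead of the raw looked-up one.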

 

Thanks.
