Quinten
Databricks Partner

I'm facing the same issue as described by @mrcity. There is no easy way to alter the dataframe that is created inside the score_batch() function. Filtering out rows in the (sklearn) pipeline itself is also not convenient, since these transformers typically operate on the features (columns) rather than dropping rows.

The solution described here is quite clean, but it goes against the idea of the 'feature-aware' batch inference provided by the FeatureStoreClient(). It is a workable workaround, but IMO one should not need to provide the exact feature tables to do this filtering. It would be much better if score_batch() had an option to drop rows with NULL values from the dataframe before running the model on it.
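For reference, here is a rough sketch of how I understand this kind of workaround; it is not necessarily the exact code from the solution above. The function name `score_without_null_features` and its parameters are my own hypothetical names. It assumes `fs` is a FeatureStoreClient, `keys_df` is the Spark dataframe of lookup keys, and the feature table has one row per key.

```python
def score_without_null_features(fs, model_uri, keys_df, feature_table,
                                key_cols, feature_cols=None):
    """Hypothetical helper: score only the keys whose features are non-NULL.

    fs            -- assumed to be a databricks.feature_store.FeatureStoreClient
    keys_df       -- Spark DataFrame holding the lookup keys to score
    feature_table -- name of the feature table the model was trained on
    key_cols      -- list of primary-key column names
    feature_cols  -- optional list of columns to check; None checks all columns
    """
    # Read the feature table explicitly (this is the step that defeats
    # the 'feature-aware' idea, since we must know the table name).
    features = fs.read_table(feature_table)

    # Keep only the keys whose feature values contain no NULLs.
    valid_keys = features.dropna(subset=feature_cols).select(*key_cols)

    # Restrict the input to those keys, then let score_batch() do the lookup.
    filtered = keys_df.join(valid_keys, on=key_cols, how="inner")
    return fs.score_batch(model_uri, filtered)
```

Note that keys that are missing from the feature table entirely are dropped as well by the inner join, and a model that uses several feature tables would need one such filter per table.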

If there are any other suggestions, I would like to hear them too.