Feature Store - Feature Lookup Engine with join on partial key and Filter
11-17-2022 01:30 AM
Hello,
I am working with lookupEngine functions.
However, some of our feature tables have a finer granularity than the input DataFrame.
Here is an example:
Table A has a primary key made of two columns: numero_p, numero_s
So while performing the FeatureLookup, we want to join the customer DataFrame with table A on part of that key, and obtain an output DataFrame including numero_p and numero_s.
Second, we have some customer data in a feature store table and we want to read only rows of a specific type, that is, filter on a feature column.
Here is an example:
Table A: numero_p, typology, datdeb, datfin
Now we want to filter on the column "typology".
So while performing the FeatureLookup, we would like to pass a filter that keeps only customers with typology 'A'.
FeatureLookup does not currently support either of these two operations.
Will they be supported in the future?
Thanks
11-17-2022 11:21 PM
Hi @SERET Nathalie, I can check internally on this request.
In the meantime, please let us know if this helps: https://docs.databricks.com/machine-learning/feature-store/feature-tables.html
https://docs.databricks.com/machine-learning/feature-store/index.html#
11-18-2022 07:05 AM
Hi Debayan,
In the documentation, I see a paragraph titled "Create a TrainingSet when lookup keys do not match the primary keys".
However, the join must still be done on all the columns of the primary key: we list the DataFrame columns in the same order as the primary key columns. We cannot do a partial join on the first column only; trying to do so raises an error.
As a workaround, I use the read_table function to read the data from the feature store table and enrich the input DataFrame before calling the lookup engine.
But this is not a satisfactory solution.
Best regards
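
For readers looking for the workaround described above, here is a minimal sketch, not an official Databricks pattern. It assumes both arguments are Spark DataFrames and uses only the standard `DataFrame.filter` and `DataFrame.join` methods. The helper name `enrich_on_partial_key` and the table name `my_db.table_a` are hypothetical; the column names come from the examples in the thread.

```python
def enrich_on_partial_key(input_df, feature_df, join_keys,
                          filter_column=None, filter_value=None, how="left"):
    """Join input_df to feature_df on a subset of the feature table's primary key.

    Hypothetical helper: both arguments are expected to be Spark DataFrames.
    """
    if filter_column is not None:
        # Keep only the feature rows of the requested type, e.g. typology == 'A'.
        feature_df = feature_df.filter(feature_df[filter_column] == filter_value)
    # Joining on a partial key can return several feature rows per input row
    # (for example, one row per numero_s for each numero_p).
    return input_df.join(feature_df, on=list(join_keys), how=how)


# Usage on Databricks (not executed here; the table name is a placeholder):
#
#   from databricks.feature_store import FeatureStoreClient
#   fs = FeatureStoreClient()
#   table_a = fs.read_table(name="my_db.table_a")
#   enriched_df = enrich_on_partial_key(
#       customer_df, table_a, ["numero_p"], "typology", "A"
#   )
```

Because the join happens outside FeatureLookup, the enriched columns are not tracked as feature store lookups, which is presumably part of why the workaround feels unsatisfying.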

