How to read feature table without target_df / online inference based on filter_condition in databricks feature store

AmithAdiraju16
New Contributor II

I'm using databricks-feature-store == 0.6.1.

After I register my feature table with `create_feature_table` and write data with `write_table`, I want to read that feature table based on filter conditions (for example, on a timestamp column) without calling `create_training_set`. I'd like to do this for both training and batch inference.

I found the `read_table` function to accomplish this, but I'm not sure how to provide filter conditions in its function call.

Ideally, I'd also like to read a single feature row from the online store by passing entity keys; I couldn't find any documentation for reads from the offline or online store related to my use case.

Any help is much appreciated. Thanks.

4 REPLIES

Hubert-Dudek
Esteemed Contributor III

`create_training_set` is essentially just a SELECT over Delta tables; all feature tables are registered Delta tables. Here is example code I used to handle that:

    # Feature tables are ordinary Delta tables, so they can be read with SQL.
    customer_features_df = spark.sql("SELECT * FROM recommender_system.customer_features")
    product_features_df = spark.sql("SELECT * FROM recommender_system.product_features")

    # Join the label DataFrame to the feature tables on the entity keys
    # and keep the result (the original snippet discarded the joined DataFrame).
    training_df = training_df.join(
        customer_features_df,
        on=[training_df.cid == customer_features_df.customer_id,
            training_df.transaction_dt == customer_features_df.dt],
        how="inner",
    ).join(
        product_features_df,
        on="product_id",
        how="inner",
    )

Thanks Hubert. So you mean to say that if I want to read a feature table separately, I can just run a regular SQL SELECT statement on that feature table, as if it were a normal Delta table?

`read_table` is not needed in this case ?

Hubert-Dudek
Esteemed Contributor III

yes

Along similar lines, I'm struggling to understand one concept about feature tables here.

If I can read a feature table directly through SQL logic and filter it to the dates of my choice, then how is the Databricks Feature Store different from a time-partitioned "data mart"?

Similarly, with feature versioning, every time I want to read a different set of features from the offline store, I just pass different column names. How is that different from a regular SELECT statement in SQL or on a DataFrame?

I'm struggling to justify the value of the Databricks Feature Store to my team when they say "it's just another data mart". I have an intuition that it's not, but I can't give proper reasoning.
