Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to read feature table without target_df / online inference based on filter_condition in databricks feature store

AmithAdiraju16
New Contributor II

I'm using databricks feature store == 0.6.1.

After I register my feature table with `create_feature_table` and write data with `write_table`, I want to read that feature table based on filter conditions (maybe on a timestamp column) without calling `create_training_set`. I'd like to do this for both training and batch inference.

I found the `read_table` function to accomplish this, but I'm not sure how to provide filter conditions in its function call.

Ideally, I'd also like to read a single feature row from the online store by passing some entity keys. I couldn't find any documentation for reads from the offline or online store related to my use case.

Any help is much appreciated. Thanks.
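For context, a minimal sketch of the filtered offline read being asked about, assuming the 0.6.x Feature Store client: `read_table` returns an ordinary Spark DataFrame, so a filter can be applied after the call. The table name, column name (`dt`), and the `timestamp_filter` helper below are all hypothetical, for illustration only.

```python
# Hypothetical helper that builds a timestamp filter expression;
# the column name and date are illustrative only.
def timestamp_filter(column: str, start: str) -> str:
    return f"{column} >= '{start}'"

# On Databricks (feature store 0.6.x), the filtered read would look like:
# from databricks.feature_store import FeatureStoreClient
# fs = FeatureStoreClient()
# features_df = fs.read_table(name="recommender_system.customer_features")
# recent_df = features_df.filter(timestamp_filter("dt", "2022-01-01"))
```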

4 REPLIES

Hubert-Dudek
Esteemed Contributor III

`create_training_set` is essentially a SELECT over Delta tables - all feature tables are just registered Delta tables. Here is example code I used to handle that:

    # Feature tables are plain Delta tables, so they can be read with SQL
    customer_features_df = spark.sql("SELECT * FROM recommender_system.customer_features")
    product_features_df = spark.sql("SELECT * FROM recommender_system.product_features")

    # training_df holds the raw training examples (entity keys + label)
    training_set_df = training_df.join(
        customer_features_df,
        on=[training_df.cid == customer_features_df.customer_id,
            training_df.transaction_dt == customer_features_df.dt],
        how="inner",
    ).join(
        product_features_df,
        on="product_id",
        how="inner",
    )
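Along the same lines, a filter is just a WHERE clause on the backing Delta table. A sketch, reusing the table name from the example above with a hypothetical date range:

```python
# The table name follows the example above; the WHERE clause is a
# hypothetical timestamp filter, shown for illustration.
query = (
    "SELECT * FROM recommender_system.customer_features "
    "WHERE dt BETWEEN '2022-01-01' AND '2022-01-31'"
)
# On a cluster: filtered_df = spark.sql(query)
```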

Thanks, Hubert. So you mean that if I want to read a feature table separately, I can just run a regular SQL SELECT statement on that feature table as if it were a normal Delta table?

`read_table` is not needed in this case?

Hubert-Dudek
Esteemed Contributor III

yes

Along similar lines, I'm struggling to understand one concept about feature tables here.

If I can read a feature table directly through SQL and filter it to the dates of my choice, how is the Databricks Feature Store different from a "data mart" that is partitioned by time?

Similarly, with feature versioning, every time I want to read a different set of features from the offline store, I just pass different column names. How is that different from a regular SELECT statement in SQL or on a DataFrame?

I'm struggling to justify the value of the Databricks Feature Store to my team when they say "it's just another data mart." My intuition is that it's not, but I can't give proper reasoning.