Data Engineering

Forum Posts

by mrcity • New Contributor II

02-06-2023 2:35:38 PM

2692 Views
3 replies
1 kudos

Exclude absent lookup keys from dataframes made by create_training_set()

I've got data stored in feature tables, plus in a data lake. The feature tables are expected to lag the data lake by at least a little bit. I want to filter data coming out of the feature store by querying the data lake for lookup keys out of my inde...

Latest Reply

Quinten
New Contributor II

08-14-2024 7:04:56 AM

1 kudos

I'm facing the same issue as described by @mrcity. There is no easy way to alter the dataframe, which is created inside the score_batch() function. Filtering out rows in the (sklearn) pipeline itself is also not convenient since these transformers ar...
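
One possible workaround for the situation both posters describe (not confirmed by the thread) is to drop rows whose lookup keys are missing from the feature table before the DataFrame is handed to create_training_set() or score_batch(). The sketch below uses a PySpark left semi join. `keep_rows_with_features` and its parameter names are hypothetical, and it assumes the feature table has already been loaded as a DataFrame, for example with `fs.read_table`.

```python
def keep_rows_with_features(batch_df, feature_table_df, lookup_keys):
    """Keep only the rows of batch_df whose lookup keys exist in the feature table.

    batch_df and feature_table_df are assumed to be PySpark DataFrames;
    lookup_keys is a list of column names present in both (an assumption).
    """
    # Only the key columns are needed to decide which rows survive.
    present_keys = feature_table_df.select(*lookup_keys)
    # A left semi join filters batch_df without adding columns or duplicating rows.
    return batch_df.join(present_keys, on=lookup_keys, how="left_semi")
```

Because a left semi join only filters, the result keeps exactly the columns of `batch_df`, so it can be passed on unchanged.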

by Direo • Contributor II

04-07-2023 5:38:07 AM

1611 Views
1 reply
0 kudos

Operations applied when running fs.write_table to overwrite existing feature table in hive metastore

Hi, there was a need to query an older snapshot of a table. Therefore I ran:
deltaTable = DeltaTable.forPath(spark, 'dbfs:/<path>')
display(deltaTable.history())
and noticed that every fs.write_table run triggers two operations: Write and CREATE OR REPLACE...

Latest Reply

Anonymous
Not applicable

04-10-2023 5:58:47 AM

0 kudos

@Direo: When you use the deltaTable.write() method to write a DataFrame into a Delta table, it actually triggers the Delta write operation internally. This operation performs two actions: it writes the new data to disk in the Delta format, and it at...
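
For the original goal of querying an older snapshot, Delta Lake's time travel reader option is one route. This is a minimal sketch, assuming `spark` is an active SparkSession and the version number is taken from the `version` column of `deltaTable.history()`. `read_delta_version` is a hypothetical helper name, not something from the thread.

```python
def read_delta_version(spark, path, version):
    """Load a Delta table as it existed at a given version number.

    spark is assumed to be an active SparkSession; path is the table
    location and version an integer listed in the table history.
    """
    return (
        spark.read.format("delta")
        .option("versionAsOf", version)  # Delta Lake time travel by version
        .load(path)
    )
```

Since each fs.write_table run reportedly records two operations, it is worth checking the history output to confirm which version number holds the snapshot you want.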

by AmithAdiraju16 • New Contributor II

01-06-2023 9:14:55 AM

2786 Views
4 replies
1 kudos

How to read feature table without target_df / online inference based on filter_condition in databricks feature store

I'm using databricks feature store == 0.6.1. After I register my feature table with `create_feature_table` and write data with `write_table`, I want to read that feature table based on filter conditions (perhaps on a timestamp column) without calling ...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

01-06-2023 2:54:05 PM

1 kudos

create_training_set is just a simple SELECT from Delta tables. All feature tables are just registered Delta tables. Here is example code that I used to handle that: customer_features_df = spark.sql("SELECT * FROM recommender_system.customer_fea...
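
Since feature tables are registered Delta tables, a filtered read can be sketched without create_training_set. The helper name `read_features_where`, and the table name and filter expression in the usage line, are placeholders rather than values from the thread. `spark` is assumed to be an active SparkSession.

```python
def read_features_where(spark, table_name, condition):
    """Read a registered feature table and keep rows matching a SQL condition.

    table_name is the fully qualified table name and condition a SQL
    boolean expression given as a string (both supplied by the caller).
    """
    # spark.table resolves the registered table; where() accepts a SQL string.
    return spark.table(table_name).where(condition)


# Hypothetical usage (names are illustrative only):
# recent_df = read_features_where(spark, "my_db.my_features", "event_ts >= '2023-01-01'")
```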

by spartakos • New Contributor

06-30-2022 8:29:42 AM

941 Views
0 replies
0 kudos

Big data ingest into Delta Lake

I have a feature table in BQ that I want to ingest into Delta Lake. This feature table in BQ has 100 TB of data. This table can be partitioned by DATE. What best practices and approaches can I take to ingest this 100 TB? In particular, what can I do to ...

Databricks Community