Saving a new feature table to the Databricks Feature Store does not record the data sources of the tables used to create it, because they are Hive tables pointing to Delta tables in Azure Data Lake Storage Gen1

Jack_Watson
Contributor

My notebook pulls in Hive tables from DBFS that point to ADLS Gen1 file locations for their data (Delta tables), builds the feature table as a DataFrame within the notebook, and then calls the feature store client to save the feature table to the database I created for feature tables. When I call 'create_table', it successfully creates and saves the feature table to the database, and it is viewable in the feature store. However, it does not record the data sources of the tables used to create the feature table: the ADLS file paths are deemed invalid, and the error message states that the table's path name must be a valid dbfs:/ path, even though the tables sit in DBFS but point to the Azure data lake.
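Roughly, the flow looks like this (a minimal sketch only; the database, table, and column names are made up for illustration):

from databricks.feature_store import FeatureStoreClient

fs = FeatureStoreClient()

# Hive metastore table whose underlying Delta files live in ADLS Gen1 (adl:// paths).
# "raw_db.transactions" is a hypothetical name standing in for my real source table.
source_df = spark.table("raw_db.transactions")

# Build the feature DataFrame in the notebook.
features_df = (
    source_df
    .groupBy("customer_id")
    .count()
    .withColumnRenamed("count", "transaction_count")
)

# create_table succeeds and the feature table appears in the feature store,
# but recording the data sources fails with INVALID_PARAMETER_VALUE because
# the source paths are adl:// rather than dbfs:/.
fs.create_table(
    name="feature_db.customer_features",
    primary_keys=["customer_id"],
    df=features_df,
    description="Customer transaction features",
)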

Error message (actual file path changed to keep it confidential):

Exception: {'error_code': 'INVALID_PARAMETER_VALUE', 'message': 'Path name adl://****PATH_TO_DELTA_TABLE_IN_ADLS_GEN1_LAKE**** must be a valid dbfs:/ path.'}

I would like to know if there is a way to get the feature store client to record the data sources of the tables used to create the feature table as their actual ADLS file paths.

1 ACCEPTED SOLUTION

Atanu
Esteemed Contributor

@Jack Watson Could you please confirm the write is succeeding? If yes, then as per my understanding this is a warning from a validation that we will be removing shortly. We'll likely remove the validation on saving the data source. Thanks.


6 REPLIES

Anonymous
Not applicable

Hello, @Jack Watson​! My name is Piper and I'm a moderator for the Databricks community. Thank you for asking and welcome to the community!

Let's give the other members a chance to respond before we circle back to you. Thanks in advance for your patience.

Kaniz
Community Manager

Hi @Jack Watson, you can go through this link. It has everything you need about the Feature Store.

virtualzx
New Contributor II

@Jack Watson We have encountered a similar issue since we upgraded to the most recent build. Code that used to work does not work anymore. Basically, if the Spark DataFrame is dynamically generated or backed by a cloud storage bucket, recording the data source fails. However, you can get around this by temporarily saving to DBFS first, then loading it back out and saving that to the feature store (sketch below). @Kaniz Fatma this is clearly a bug, as the same code used to run without error, and the code given in the link you shared will simply fail if the data source is loaded from a cloud storage bucket.
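For reference, a minimal sketch of that workaround, under the same assumptions as the question above (the dbfs:/ staging path is made up, and "features_df" and "fs" are the hypothetical DataFrame and FeatureStoreClient from there):

# Stage the DataFrame to a dbfs:/ Delta path first, then reload it, so the
# feature store sees a valid dbfs:/ data source instead of an adl:// path.
tmp_path = "dbfs:/tmp/feature_staging/customer_features"

features_df.write.format("delta").mode("overwrite").save(tmp_path)
staged_df = spark.read.format("delta").load(tmp_path)

fs.create_table(
    name="feature_db.customer_features",
    primary_keys=["customer_id"],
    df=staged_df,
    description="Customer transaction features",
)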

Kaniz
Community Manager

Hi @Jack Watson, thank you so much for flagging this. I'll look into it and get back to you. Thanks.

Atanu
Esteemed Contributor

@Jack Watson Could you please confirm the write is succeeding? If yes, then as per my understanding this is a warning from a validation that we will be removing shortly. We'll likely remove the validation on saving the data source. Thanks.

Atanu
Esteemed Contributor

Though we do not have an ETA at this moment.
