Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Saving a new feature table to the Databricks Feature Store does not record the data sources of the tables used to create it, because they are Hive tables pointing to Delta tables in Azure Data Lake Storage Gen1

Jack_Watson
Contributor

My notebook pulls in Hive tables from DBFS that point to ADLS Gen1 file locations for their data (Delta tables), builds the feature table as a DataFrame within the notebook, and then calls the feature store client to save the feature table to the database I created for feature tables. When I call `create_table`, it successfully creates and saves the feature table to the database, and it is viewable in the feature store. However, it does not record the data sources of the tables used to create the feature table: the ADLS file paths are deemed invalid, and the error message states that the table's path name must be a valid DBFS file path, even though the tables are registered in DBFS but point to the Azure data lake.

Error message (actual file path changed to keep it confidential):

Exception: {'error_code': 'INVALID_PARAMETER_VALUE', 'message': 'Path name adl://****PATH_TO_DELTA_TABLE_IN_ADLS_GEN1_LAKE**** must be a valid dbfs:/ path.'}
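The error suggests the feature store client validates that every data-source path uses the dbfs:/ scheme before recording it. A hypothetical, simplified sketch of that check (the real validation is internal to the Feature Store client and not shown here):

```python
# Hypothetical, simplified version of the path validation implied by the
# error above; the actual check inside the Feature Store client may differ.
def is_valid_dbfs_path(path: str) -> bool:
    """Return True only for paths using the dbfs:/ scheme."""
    return path.startswith("dbfs:/")

print(is_valid_dbfs_path("dbfs:/mnt/feature_tables/my_table"))           # True
print(is_valid_dbfs_path("adl://mylake.azuredatalakestore.net/delta"))   # False
```

Under this check, any `adl://` path (ADLS Gen1) is rejected even when the Hive table itself is registered in the DBFS metastore.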

Is there a way to get the feature store client to record the data sources of the tables used to create the feature table using their actual ADLS file paths?

1 ACCEPTED SOLUTION


Atanu
Esteemed Contributor

@Jack Watson​ Could you please confirm that the write is succeeding? If yes, then as per my understanding this is a warning from a validation step that we will be removing shortly. We will likely remove the validation that saves the data source. Thanks.

View solution in original post

6 REPLIES 6

Anonymous
Not applicable

Hello, @Jack Watson​! My name is Piper and I'm a moderator for the Databricks community. Thank you for asking and welcome to the community!

Let's give the other members a chance to respond before we circle back to you. Thanks in advance for your patience.

Kaniz_Fatma
Community Manager

Hi @Jack Watson​, you can go through this link; it has everything you need about the feature store.

virtualzx
New Contributor II

@Jack Watson​ We have encountered a similar issue since we upgraded to the most recent build. Code that used to work no longer does: if the Spark DataFrame is dynamically generated or backed by a cloud storage bucket, the save fails. However, you can work around this by temporarily saving to DBFS first, then loading the data back and saving it to the feature store. @Kaniz Fatma​ this is clearly a bug, as the same code used to run without error, and the code in the link you gave will simply fail if the data source is loaded from a cloud storage bucket.
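A minimal sketch of the DBFS round-trip workaround described above, assuming a Databricks notebook where `spark` and an ADLS-backed `source_df` already exist; the paths and table names are hypothetical placeholders:

```python
# Sketch of the workaround (hypothetical names and paths).
# Assumes a Databricks notebook with `spark` and a cloud-storage-backed
# DataFrame `source_df` already defined.
from databricks.feature_store import FeatureStoreClient

tmp_path = "dbfs:/tmp/feature_source_staging"  # hypothetical scratch location

# 1. Materialize the cloud-storage-backed DataFrame to DBFS as Delta.
source_df.write.format("delta").mode("overwrite").save(tmp_path)

# 2. Re-read it so the DataFrame's lineage now points at a dbfs:/ path.
dbfs_df = spark.read.format("delta").load(tmp_path)

# 3. Save the feature table; the data-source validation should now pass.
fs = FeatureStoreClient()
fs.create_table(
    name="feature_db.my_features",  # hypothetical target table
    primary_keys=["id"],            # hypothetical primary key column
    df=dbfs_df,
)
```

The extra write adds I/O cost and a stale copy of the data in DBFS, so this is only a stopgap until the validation is removed.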

Hi @Jack Watson​, thank you so much for flagging this. I'll look into it and get back to you. Thanks.

Atanu
Esteemed Contributor

@Jack Watson​ Could you please confirm that the write is succeeding? If yes, then as per my understanding this is a warning from a validation step that we will be removing shortly. We will likely remove the validation that saves the data source. Thanks.

Atanu
Esteemed Contributor

Though we do not have an ETA at this moment.
