Data Engineering

Delta sharing open protocol in Unity catalog: FileNotFoundError

kiko_roy
Contributor

Hi Team

I have created a recipient under Delta Sharing (Azure Databricks). Unity Catalog is enabled and the data is stored in ADLS Gen2. I have downloaded the credential file and am trying to reuse it in my Python script (as per the Databricks documentation) for a POC from my local machine (Jupyter notebook). The delta_sharing package is installed successfully on my local system.

import delta_sharing

# Path to the downloaded credential (profile) file
profile = "C:/Users/XXXXXX/Downloads/config.share"
client = delta_sharing.SharingClient(profile)

# Listing the shared tables works
client.list_all_tables()

# Loading a table fails with FileNotFoundError
delta_sharing.load_as_pandas(f"{profile}#<sharename>.<schemaname>.<tablename>")

The tables are listed successfully, but when I try to load a particular table as a pandas DataFrame, I get this error:

FileNotFoundError: https://......................._unitystorage/schemas/............../tables/.............../part-0000...

Can someone suggest why it is failing and how I can resolve it?
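A minimal sketch for narrowing this down, assuming the delta_sharing package is installed. In Delta Sharing, listing tables only talks to the sharing server, while loading a table also downloads data files from the underlying cloud storage, so a small limited read helps separate the two steps. The helper names table_url and try_small_read are hypothetical and only for illustration; the share, schema, and table names are placeholders you fill in yourself.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' URL that delta_sharing expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def try_small_read(url: str, rows: int = 5):
    """Try to load only a few rows; return the DataFrame, or None on FileNotFoundError."""
    # Imported lazily so that table_url can be used without the package installed.
    import delta_sharing

    try:
        # limit keeps the read small; it is a hint for how many rows to fetch.
        return delta_sharing.load_as_pandas(url, limit=rows)
    except FileNotFoundError as err:
        # The message normally contains the path or URL that could not be opened.
        print(f"A file could not be opened: {err}")
        return None
```

If client.list_all_tables() succeeds but try_small_read returns None with a storage URL in the message, the problem is more likely on the path to the storage account than in the share definition itself.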

1 ACCEPTED SOLUTION

Accepted Solutions

I just reproduced it for you. It is 100% the networking on the Azure storage account. 


5 REPLIES

jacovangelder
Contributor III

I wasn't able to reproduce your issue. Is your Delta table operable? Can you see sample data and query the table from within Databricks? It almost looks like some Parquet files are missing, which would make the Delta table unqueryable.

 

Yes, my Delta table is operable: it has sample data and can be queried from a SQL warehouse as well as from notebooks. If I run the script from within the Databricks environment (i.e. the same metastore), the data can be read with Delta Sharing. But if I run it outside, from a Jupyter notebook, the data cannot be read.

Then I am 100% sure that it is because your Azure Data Lake Storage account does not have public network access enabled, or has a firewall or private endpoint set up. To query Delta shares, you need to be able to access the storage account where the Delta tables reside.
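One way to check this from the machine that runs the script is to request the failing file URL directly and look at the HTTP status. The following is a rough sketch using only the Python standard library; probe_status and interpret are hypothetical helper names, and the status interpretations are assumptions about typical behavior, not a definitive diagnosis. Paste in the full URL from your own FileNotFoundError message.

```python
import urllib.error
import urllib.request
from typing import Optional


def probe_status(file_url: str, timeout: float = 10.0) -> Optional[int]:
    """Request the first byte of file_url; return the HTTP status, or None if no HTTP response arrived."""
    request = urllib.request.Request(file_url, headers={"Range": "bytes=0-0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403 or 404.
        return err.code
    except OSError:
        # Covers URLError, timeouts, DNS failures, and refused connections.
        return None


def interpret(status: Optional[int]) -> str:
    """Map a status to a rough hint (assumed typical meanings, not guaranteed)."""
    if status is None:
        return "no HTTP response: check DNS, routing, or proxy between this machine and the storage host"
    if status in (200, 206):
        return "storage reachable: the file can be read from this machine"
    if status == 403:
        return "request refused: possibly a storage firewall or network rule, or an expired link"
    if status == 404:
        return "file not found at that URL"
    return f"unexpected status {status}"
```

For example, print(interpret(probe_status(url))) with url set to the address from the error. Keep in mind that the file links handed out by the sharing server are short-lived, so use a URL from a fresh run.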


Thanks @jacovangelder for checking this out. I suspected as much. I tested by running the code in a workspace under the same metastore and could read the data; when I ran the same code from a workspace under a different metastore, I hit the same issue. So this test reconfirms it. My system is indeed in a private VNet without any endpoints created (as I later confirmed with my networking team).
