Fetching data from a Databricks server with Delta Sharing is limited to 100 records

uddipak
New Contributor

Hi,

I'm trying to fetch a table from a Databricks instance hosted in Azure using the delta-sharing Python library. The library always returns a DataFrame with exactly 100 rows when fetching table data, and I see the same result for every table shared with me. The instance is managed by a different team, and they have confirmed that no configuration on their side limits the data shared.

import delta_sharing

client = delta_sharing.SharingClient("...")
client.list_all_tables() # this works fine
.
.
.
pandas_df = delta_sharing.load_as_pandas(table_url)
print(len(pandas_df)) # this prints 100
print(pandas_df.head(1000)) # this prints 100 records

Is there a mistake in the code, or is there an issue with how the Databricks Delta Sharing server is configured?

Thanks.

szymon_dybczak
Esteemed Contributor III

Hi @uddipak ,

Maybe load_as_pandas applies a default limit internally? Can you try setting the limit explicitly?


import delta_sharing

client = delta_sharing.SharingClient("...")
client.list_all_tables() # this works fine
.
.
.
pandas_df = delta_sharing.load_as_pandas(table_url, limit=1000)
print(len(pandas_df)) # check whether this still prints 100
print(pandas_df.head(1000)) # check whether this still shows only 100 records
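
If it helps to narrow things down, here is a minimal sketch that calls the loader with several limit values and records how many rows come back each time. The helper name probe_row_counts and the chosen limit values are hypothetical and only for illustration. It assumes the loader accepts a limit keyword argument, as delta_sharing.load_as_pandas does. It does not identify the cause by itself; it only shows whether the row count follows the requested limit or stays capped.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def probe_row_counts(
    load: Callable[..., Any],
    table_url: str,
    limits: Iterable[Optional[int]] = (None, 50, 1000),
) -> Dict[Optional[int], int]:
    """Call `load(table_url, limit=...)` once per limit and record the row counts.

    `load` is expected to be something like delta_sharing.load_as_pandas;
    any callable that takes a `limit` keyword and returns a sized object works.
    """
    counts: Dict[Optional[int], int] = {}
    for limit in limits:
        result = load(table_url, limit=limit)
        counts[limit] = len(result)
    return counts


# Usage against the real library (not run here, needs a valid profile and table):
#   import delta_sharing
#   print(probe_row_counts(delta_sharing.load_as_pandas, table_url))
```

If a limit of 50 returns 50 rows but both no limit and a limit of 1000 return 100, the client is honouring the limit and the cap is being applied somewhere else, which would be worth reporting to the team that manages the share.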