Read JSON files on Unity Catalog

seefoods
Valued Contributor

Hello guys,

I have an issue when loading several JSON files that share the same schema on Databricks. The folder contains the files listed here, and I read them with the options shown after the list:

2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_alice_out.json 516.13 KB

2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_bob_out.json 516.13 KB

2025_08_10_21_55_00_2025_08_24_21_55_00_17Q1D_alice_out.json 514.13 KB

2025_08_10_21_55_00_2025_08_24_21_55_00_17Q51D_bob_out.json 418.13 KB

 

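options = {
    "multiLine": True,
    # Not a documented JSON option; the JSON reader infers the schema by default.
    "inferSchema": True,
    "allowUnquotedFieldNames": True,
    "allowSingleQuotes": True,
    "allowBackslashEscapingAnyCharacter": True,
    # Also read files in subdirectories of the load path.
    "recursiveFileLookup": True,
}

# Volume paths are absolute, so the path starts with a slash.
df = spark.read.format("json").options(**options).load("/Volumes/folder/dir1")

It seems to pick up only two of the files, at random.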

Does anyone know how to solve this issue?


Cordially, 

seefoods
Valued Contributor

Sometimes the DataFrame returns nothing. To force the load, I added the following option, but it still doesn't load all the files present in the volume:

pathGlobFilter="*{*alice*}*.json"
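
A side note for readers: a filter that contains alice can only match the alice files, so the bob files would be skipped by design. The sketch below checks this locally against the four file names from the question. It is only an approximation: it uses Python's fnmatch rather than the Hadoop glob syntax that Spark applies, and the simplified pattern `*alice*.json` is an assumption standing in for the pattern above.

```python
from fnmatch import fnmatchcase

# File names copied from the question.
file_names = [
    "2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_alice_out.json",
    "2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_bob_out.json",
    "2025_08_10_21_55_00_2025_08_24_21_55_00_17Q1D_alice_out.json",
    "2025_08_10_21_55_00_2025_08_24_21_55_00_17Q51D_bob_out.json",
]

# Assumed simplification of the filter used in the post.
pattern = "*alice*.json"

# Split the names into those the pattern accepts and those it rejects.
matched = [name for name in file_names if fnmatchcase(name, pattern)]
skipped = [name for name in file_names if name not in matched]

print(len(matched), "matched;", len(skipped), "skipped")
for name in skipped:
    print("skipped:", name)
```

To load every file in the folder, a broader filter such as `*.json`, or no filter at all, would be the thing to try.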

szymon_dybczak
Esteemed Contributor III

Hi @seefoods ,

Could you also share how you check whether some JSON files were not loaded?

seefoods
Valued Contributor

Hello @szymon_dybczak , 

It's OK, I have checked the history of the table. I was confused by the difference between the display() output and the actual output of the write operation.

Thanks

szymon_dybczak
Esteemed Contributor III

Hi @seefoods ,

That's what I suspected. When you use the display() method in Azure Databricks to view a DataFrame, the number of rows displayed is limited to prevent browser crashes.
The same applies to notebook cell outputs: table results are limited to 10,000 rows or 2 MB, whichever is lower.

See: Known limitations Databricks notebooks | Databricks on AWS

So a more reliable way of checking is, for example, to run a count operation on the DataFrame.
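
A minimal sketch of such a check, assuming the DataFrame was read from files on a runtime that exposes the `_metadata.file_path` file metadata column. The helper names `summarize_load` and `missing_files` are hypothetical, introduced only for illustration. It counts all rows and lists the distinct source files, so the result does not depend on what display() happens to show.

```python
import os


def summarize_load(df):
    """Return (row_count, sorted distinct source file paths) for a file-based DataFrame."""
    # count() scans the full DataFrame, unlike display(), which truncates its output.
    row_count = df.count()
    # Assumption: the hidden _metadata column is available for this source.
    rows = df.select("_metadata.file_path").distinct().collect()
    return row_count, sorted(row[0] for row in rows)


def missing_files(expected_names, loaded_paths):
    """Return the expected file names that do not appear among the loaded paths."""
    loaded_names = {os.path.basename(path) for path in loaded_paths}
    return sorted(set(expected_names) - loaded_names)


# Usage in a notebook, where `spark` and `options` already exist:
# df = spark.read.format("json").options(**options).load("/Volumes/folder/dir1")
# row_count, files = summarize_load(df)
# print(row_count, "rows from", len(files), "files")
# print("not loaded:", missing_files(["first_out.json", "second_out.json"], files))
```

Comparing the returned file list with the folder listing shows directly whether any file was skipped at read time.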
