09-04-2025 02:09 AM
Hello guys,
I have an issue when I load several JSON files that share the same schema on Databricks. The directory contains these files:
2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_alice_out.json 516.13 KB
2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_bob_out.json 516.13 KB
2025_08_10_21_55_00_2025_08_24_21_55_00_17Q1D_alice_out.json 514.13 KB
2025_08_10_21_55_00_2025_08_24_21_55_00_17Q51D_bob_out.json 418.13 KB
When I run
df = spark.read.format("json").options(**options).load("Volumes/folder/dir1")
it picks up only two of the files, seemingly at random.
Does anyone know how to solve this issue?
Cordially,
09-04-2025 02:19 AM
Sometimes the DataFrame returns nothing. To force the match I added the following option, but it still doesn't load all the files present on the Volume:
pathGlobFilter="*{*alice*}*.json"
09-04-2025 02:49 AM
Hi @seefoods ,
Could you also share how you check whether some JSON files are not loaded?
09-04-2025 08:32 AM
Hello @szymon_dybczak ,
It's OK, I have checked the history of the table. I was confused by the difference between the display() output and the actual output of the write operation.
Thanks
09-04-2025 08:56 AM
Hi @seefoods ,
That's what I suspected. When you use the display() method in Azure Databricks to view a DataFrame, the number of rows displayed is limited to prevent browser crashes.
The same applies to notebook cell outputs. Table results are limited to 10,000 rows or 2 MB, whichever is lower.
See: Known limitations Databricks notebooks | Databricks on AWS
So a more reliable way of checking is, for example, to perform a count operation on the DataFrame.
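
A minimal sketch of that check, assuming a PySpark DataFrame read from files. The helper name `summarize_load` is hypothetical and not part of any Databricks API; it only calls the standard DataFrame methods `count()` and `inputFiles()`. Note that `inputFiles()` returns a best-effort snapshot of the files backing the DataFrame, so treat the list as a diagnostic rather than a guarantee.

```python
def summarize_load(df):
    """Return (row_count, sorted source files) for a file-based DataFrame.

    Hypothetical helper for illustration only.
    """
    # count() scans the full DataFrame, so it is not affected by the
    # row/size limits that apply to display() and notebook cell output.
    row_count = df.count()
    # inputFiles() gives a best-effort list of the files that make up df.
    files = sorted(df.inputFiles())
    return row_count, files


# Usage in a notebook, where `spark` and `options` already exist
# (as in the original post):
#
# df = spark.read.format("json").options(**options).load("Volumes/folder/dir1")
# rows, files = summarize_load(df)
# print(rows)
# for f in files:
#     print(f)
```

If all four file names show up in the list and the count matches what you expect, the read itself is fine and the difference you saw came from the truncated display() output.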