<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: read json files on unity catalog in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130865#M48929</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/65591"&gt;@seefoods&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;That's what I suspected. When you use the display()&amp;nbsp;&lt;SPAN&gt;method in Azure Databricks to view a DataFrame, the number of rows displayed is limited to prevent browser crashes.&lt;BR /&gt;The same applies to notebook cell outputs: table results are limited to 10,000 rows or 2 MB, whichever is lower.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;A href="https://docs.databricks.com/aws/en/notebooks/notebook-limitations#notebook-cell-outputs" target="_blank"&gt;Known limitations of Databricks notebooks | Databricks on AWS&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;So a more reliable way of checking is, for example, to perform a count() operation on the DataFrame.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 04 Sep 2025 15:56:46 GMT</pubDate>
    <dc:creator>szymon_dybczak</dc:creator>
    <dc:date>2025-09-04T15:56:46Z</dc:date>
    <item>
      <title>read json files on unity catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130791#M48901</link>
      <description>&lt;P&gt;Hello guys,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have an issue when I load several JSON files that share the same schema on Databricks. The files are:&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_alice_out.json&amp;nbsp;516.13 KB&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_bob_out.json&amp;nbsp;516.13 KB&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;2025_08_10_21_55_00_2025_08_24_21_55_00_17Q1D_alice_out.json&amp;nbsp;514.13 KB&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;2025_08_10_21_55_00_2025_08_24_21_55_00_17Q51D_bob_out.json&amp;nbsp;418.13 KB&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;options = {
    "multiLine": True,
    "inferSchema": True,
    "allowUnquotedFieldNames": True,
    "allowSingleQuotes": True,
    "allowBackslashEscapingAnyCharacter": True,
    "recursiveFileLookup": True,
}

df = spark.read.format("json").options(**options).load("Volumes/folder/dir1")&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;It randomly picks up only two of the files.&lt;BR /&gt;&lt;BR /&gt;Does anyone know how to solve this issue?&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Cordially,&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
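A hedged sketch of the reader call above, with plain-Python assertions only (the Spark lines are commented out since they need a Databricks notebook where spark is predefined). The path is an illustrative assumption; note that Unity Catalog Volume paths are normally absolute, starting with /Volumes/ rather than Volumes/:

```python
# Sketch only: assumes a Databricks notebook where `spark` is predefined.
# The path below is hypothetical; note the leading slash, since Unity
# Catalog Volume paths are normally written as "/Volumes/...".
path = "/Volumes/folder/dir1"

options = {
    "multiLine": True,            # each file holds a multi-line JSON document
    "recursiveFileLookup": True,  # descend into subdirectories
}

# df = spark.read.format("json").options(**options).load(path)
# df.count()  # counts rows across every matched file, unlike display()
```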
      <pubDate>Thu, 04 Sep 2025 09:09:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130791#M48901</guid>
      <dc:creator>seefoods</dc:creator>
      <dc:date>2025-09-04T09:09:30Z</dc:date>
    </item>
    <item>
      <title>Re: read json files on unity catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130794#M48903</link>
      <description>&lt;P&gt;Sometimes the DataFrame returns nothing. To force the file selection I added the option below, but it still doesn't load all the files present on the Volume:&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;&lt;SPAN&gt;pathGlobFilter="*{*alice*}*.json"&lt;/SPAN&gt;&lt;/PRE&gt;&lt;/DIV&gt;</description>
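As a side note, the glob basics can be sanity-checked in plain Python with fnmatch (an assumption for illustration: fnmatch shares the asterisk-wildcard behavior of pathGlobFilter, though not the Hadoop-style brace alternation used in the pattern above). A simpler pattern such as *alice*.json already selects the alice files by name:

```python
from fnmatch import fnmatch

# Hypothetical file names modeled on the post. fnmatch mirrors the basic
# asterisk wildcards used by pathGlobFilter (it does not support the
# Hadoop-style {a,b} alternation seen in the original pattern).
files = [
    "2025_07_17_alice_out.json",
    "2025_07_17_bob_out.json",
]
matches = [f for f in files if fnmatch(f, "*alice*.json")]
print(matches)  # only the alice file
```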
      <pubDate>Thu, 04 Sep 2025 09:19:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130794#M48903</guid>
      <dc:creator>seefoods</dc:creator>
      <dc:date>2025-09-04T09:19:31Z</dc:date>
    </item>
    <item>
      <title>Re: read json files on unity catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130807#M48913</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/65591"&gt;@seefoods&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Could you also share with us how you check whether some JSON files are not being loaded?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Sep 2025 09:49:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130807#M48913</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-09-04T09:49:10Z</dc:date>
    </item>
    <item>
      <title>Re: read json files on unity catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130863#M48928</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;&amp;nbsp;,&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;It's OK, I have checked the history of the table. I was confused by the display() output versus the actual output of the write operation.&lt;BR /&gt;&lt;BR /&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 04 Sep 2025 15:32:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130863#M48928</guid>
      <dc:creator>seefoods</dc:creator>
      <dc:date>2025-09-04T15:32:26Z</dc:date>
    </item>
    <item>
      <title>Re: read json files on unity catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130865#M48929</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/65591"&gt;@seefoods&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;That's what I suspected. When you use the display()&amp;nbsp;&lt;SPAN&gt;method in Azure Databricks to view a DataFrame, the number of rows displayed is limited to prevent browser crashes.&lt;BR /&gt;The same applies to notebook cell outputs: table results are limited to 10,000 rows or 2 MB, whichever is lower.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;A href="https://docs.databricks.com/aws/en/notebooks/notebook-limitations#notebook-cell-outputs" target="_blank"&gt;Known limitations of Databricks notebooks | Databricks on AWS&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;So a more reliable way of checking is, for example, to perform a count() operation on the DataFrame.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
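The row cap described above can be illustrated without Spark; the numbers below are assumptions chosen to match the documented 10,000-row limit. A truncated preview can look complete while a full count still sees every row:

```python
# Plain-Python illustration (no Spark): a display()-style preview is capped
# at 10,000 rows, so it can hide data that a count() would still report.
rows = list(range(25_000))    # stand-in for rows of a loaded DataFrame
preview = rows[:10_000]       # what a truncated table preview would show
full_count = len(rows)        # what df.count() would report

print(len(preview), full_count)
```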
      <pubDate>Thu, 04 Sep 2025 15:56:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-json-files-on-unity-catalog/m-p/130865#M48929</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-09-04T15:56:46Z</dc:date>
    </item>
  </channel>
</rss>

