<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic autoloader error using unity catalog in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/autoloader-error-using-unity-catalog/m-p/92977#M38599</link>
    <description>&lt;P&gt;Hello!&amp;nbsp;&lt;BR /&gt;I'm new to Databricks and I'm exploring some of its features.&lt;/P&gt;&lt;P&gt;I've successfully configured a workspace with Unity Catalog, one external storage location (ADLSg2) and the associated storage credential. I granted all privileges to all account users and ran 'test connection' to confirm that everything was OK.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;When I ran the following command:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;DIV&gt;&lt;SPAN&gt;input_file_path &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;&lt;SPAN&gt;"abfss://&amp;lt;my_container&amp;gt;@&amp;lt;my_storage_account&amp;gt;.dfs.core.windows.net&amp;lt;my_path&amp;gt;"&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;schema &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;parquet&lt;/SPAN&gt;&lt;SPAN&gt;(input_file_path).schema&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;I was able to read my Parquet file and obtain the schema.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;When I tried the following code to test the Auto Loader capabilities:&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;df =&amp;nbsp;&lt;SPAN&gt;spark.readStream.&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles"&lt;/SPAN&gt;&lt;SPAN&gt;) \&lt;/SPAN&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles.format"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"parquet"&lt;/SPAN&gt;&lt;SPAN&gt;) \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles.inferColumnTypes"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"true"&lt;/SPAN&gt;&lt;SPAN&gt;) \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles.schemaLocation"&lt;/SPAN&gt;&lt;SPAN&gt;, &amp;lt;schema_location_path&amp;gt;) \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;load&lt;/SPAN&gt;&lt;SPAN&gt;(input_file_path)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;I received the following error:&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;Failure to initialize configuration for storage account &amp;lt;my_storage_account&amp;gt;.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid...&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;How is it possible that I can read my file using the standard read() function but I'm not able to read it with load()?&amp;nbsp;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Mon, 07 Oct 2024 15:58:43 GMT</pubDate>
    <dc:creator>garf</dc:creator>
    <dc:date>2024-10-07T15:58:43Z</dc:date>
    <item>
      <title>autoloader error using unity catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-error-using-unity-catalog/m-p/92977#M38599</link>
      <description>&lt;P&gt;Hello!&amp;nbsp;&lt;BR /&gt;I'm new to Databricks and I'm exploring some of its features.&lt;/P&gt;&lt;P&gt;I've successfully configured a workspace with Unity Catalog, one external storage location (ADLSg2) and the associated storage credential. I granted all privileges to all account users and ran 'test connection' to confirm that everything was OK.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;When I ran the following command:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;DIV&gt;&lt;SPAN&gt;input_file_path &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;&lt;SPAN&gt;"abfss://&amp;lt;my_container&amp;gt;@&amp;lt;my_storage_account&amp;gt;.dfs.core.windows.net&amp;lt;my_path&amp;gt;"&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;schema &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;parquet&lt;/SPAN&gt;&lt;SPAN&gt;(input_file_path).schema&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;I was able to read my Parquet file and obtain the schema.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;When I tried the following code to test the Auto Loader capabilities:&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;df =&amp;nbsp;&lt;SPAN&gt;spark.readStream.&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles"&lt;/SPAN&gt;&lt;SPAN&gt;) \&lt;/SPAN&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles.format"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"parquet"&lt;/SPAN&gt;&lt;SPAN&gt;) \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles.inferColumnTypes"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"true"&lt;/SPAN&gt;&lt;SPAN&gt;) \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles.schemaLocation"&lt;/SPAN&gt;&lt;SPAN&gt;, &amp;lt;schema_location_path&amp;gt;) \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;load&lt;/SPAN&gt;&lt;SPAN&gt;(input_file_path)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;I received the following error:&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;Failure to initialize configuration for storage account &amp;lt;my_storage_account&amp;gt;.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid...&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;How is it possible that I can read my file using the standard read() function but I'm not able to read it with load()?&amp;nbsp;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 07 Oct 2024 15:58:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-error-using-unity-catalog/m-p/92977#M38599</guid>
      <dc:creator>garf</dc:creator>
      <dc:date>2024-10-07T15:58:43Z</dc:date>
    </item>
    <item>
      <title>Re: autoloader error using unity catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-error-using-unity-catalog/m-p/93159#M38634</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/125107"&gt;@garf&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Could you please try creating an external volume on your external location and then using the file path inside that volume as the input file path?&lt;/P&gt;</description>
      <pubDate>Tue, 08 Oct 2024 15:38:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-error-using-unity-catalog/m-p/93159#M38634</guid>
      <dc:creator>Mo</dc:creator>
      <dc:date>2024-10-08T15:38:52Z</dc:date>
    </item>
    <item>
      <title>Re: autoloader error using unity catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/autoloader-error-using-unity-catalog/m-p/94132#M38820</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/28727"&gt;@Mo&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Thank you for the quick feedback, and sorry for the late reply.&lt;/P&gt;&lt;P&gt;The issue was related to the Azure container used for '&lt;SPAN&gt;schema_location_path&lt;/SPAN&gt;'.&lt;/P&gt;&lt;P&gt;I forgot to register the 'schema_location_path' container as an &lt;EM&gt;external location&lt;/EM&gt;, so the script was not able to read from &lt;STRONG&gt;this&lt;/STRONG&gt; specific location.&lt;/P&gt;&lt;P&gt;I added the missing external location, and that fixed the problem.&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 15 Oct 2024 15:00:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/autoloader-error-using-unity-catalog/m-p/94132#M38820</guid>
      <dc:creator>garf</dc:creator>
      <dc:date>2024-10-15T15:00:46Z</dc:date>
    </item>
  </channel>
</rss>

