<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Init Scripts with mounted azure data lake storage gen2 in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/init-scripts-with-mounted-azure-data-lake-storage-gen2/m-p/7814#M3588</link>
    <description>&lt;P&gt;I do not think an init script saved under a mount point works, and we do not recommend it.&lt;/P&gt;&lt;P&gt;If you specify an abfss:// path, the cluster needs to be configured so that it can authenticate to and access the ADLS Gen2 folder. Otherwise, the cluster will not be able to load the init script during startup.&lt;/P&gt;</description>
    <pubDate>Wed, 22 Mar 2023 23:01:57 GMT</pubDate>
    <dc:creator>User16752239289</dc:creator>
    <dc:date>2023-03-22T23:01:57Z</dc:date>
    <item>
      <title>Init Scripts with mounted azure data lake storage gen2</title>
      <link>https://community.databricks.com/t5/data-engineering/init-scripts-with-mounted-azure-data-lake-storage-gen2/m-p/7813#M3587</link>
      <description>&lt;P&gt;I'm trying to use an init script that is stored on Azure Data Lake Storage Gen2 mounted to DBFS.&lt;/P&gt;&lt;P&gt;I mounted the storage, so the script is at&lt;/P&gt;&lt;P&gt;dbfs:/mnt/storage/container/script.sh&lt;/P&gt;&lt;P&gt;When I try to use it, I get this error:&lt;/P&gt;&lt;P&gt;Cluster scoped init script dbfs:/mnt/storage/container/script.sh failed: Timed out with exception after 5 attempts (debugStr = 'Reading remote file for init script'), Caused by: java.io.FileNotFoundException: /WORKSPACE_ID/mnt/storage/container/script.sh: No such file or directory.&lt;/P&gt;&lt;P&gt;1) I can see this file in DBFS using the "%sh" magic command in a notebook.&lt;/P&gt;&lt;P&gt;2) I can read from this path using spark.read...&lt;/P&gt;&lt;P&gt;In the docs I found&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/dbfs/unity-catalog.html#use-dbfs-while-launching-unity-catalog-clusters-with-single-user-access-mode" alt="https://docs.databricks.com/dbfs/unity-catalog.html#use-dbfs-while-launching-unity-catalog-clusters-with-single-user-access-mode" target="_blank"&gt;https://docs.databricks.com/dbfs/unity-catalog.html#use-dbfs-while-launching-unity-catalog-clusters-with-single-user-access-mode&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Databricks recommends using DBFS mounts for init scripts, configurations, and libraries stored in external storage. This behavior is not supported in shared access mode.&lt;/P&gt;&lt;P&gt;When I try to access this file using abfss://, I get this error:&lt;/P&gt;&lt;P&gt;Failure to initialize configuration for storage account storage_name.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.key, Caused by: Invalid configuration value detected for fs.azure.account.key.)&lt;/P&gt;&lt;P&gt;even though I used the same credentials as for the mount in the previous approach.&lt;/P&gt;&lt;P&gt;Do init scripts have any limitations with mounted DBFS paths?&lt;/P&gt;&lt;P&gt;I am concerned about the workspace ID prepended to the path in the first error message.&lt;/P&gt;&lt;P&gt;I'm using exactly the same path that I get from this command:&lt;/P&gt;&lt;P&gt;dbutils.fs.ls("/mnt/storage/container/script.sh")&lt;/P&gt;&lt;P&gt;I assume that at the point where the init script is loaded, the cluster is not yet running, so it cannot traverse ADLS. So I should use abfss:// instead.&lt;/P&gt;&lt;P&gt;But how do I authenticate to this storage? I tried this approach:&lt;/P&gt;&lt;P&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/storage/azure-storage#--access-azure-data-lake-storage-gen2-or-blob-storage-using-oauth-20-with-an-azure-service-principal" alt="https://learn.microsoft.com/en-us/azure/databricks/storage/azure-storage#--access-azure-data-lake-storage-gen2-or-blob-storage-using-oauth-20-with-an-azure-service-principal" target="_blank"&gt;https://learn.microsoft.com/en-us/azure/databricks/storage/azure-storage#--access-azure-data-lake-storage-gen2-or-blob-storage-using-oauth-20-with-an-azure-service-principal&lt;/A&gt;&lt;/P&gt;&lt;P&gt;using a service principal in the Spark config, but it doesn't work.&lt;/P&gt;&lt;P&gt;Does this storage need to be public?&lt;/P&gt;</description>
      <pubDate>Mon, 13 Mar 2023 16:37:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/init-scripts-with-mounted-azure-data-lake-storage-gen2/m-p/7813#M3587</guid>
      <dc:creator>repcak</dc:creator>
      <dc:date>2023-03-13T16:37:15Z</dc:date>
    </item>
    <item>
      <title>Re: Init Scripts with mounted azure data lake storage gen2</title>
      <link>https://community.databricks.com/t5/data-engineering/init-scripts-with-mounted-azure-data-lake-storage-gen2/m-p/7814#M3588</link>
      <description>&lt;P&gt;I do not think an init script saved under a mount point works, and we do not recommend it.&lt;/P&gt;&lt;P&gt;If you specify an abfss:// path, the cluster needs to be configured so that it can authenticate to and access the ADLS Gen2 folder. Otherwise, the cluster will not be able to load the init script during startup.&lt;/P&gt;</description>
      <pubDate>Wed, 22 Mar 2023 23:01:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/init-scripts-with-mounted-azure-data-lake-storage-gen2/m-p/7814#M3588</guid>
      <dc:creator>User16752239289</dc:creator>
      <dc:date>2023-03-22T23:01:57Z</dc:date>
    </item>
  </channel>
</rss>

