I'm trying to access an init script that is stored on Azure Data Lake Storage Gen2 and mounted to DBFS.
I mounted the storage, and the script is available at
dbfs:/mnt/storage/container/script.sh
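For context, the mount was created roughly like this; the secret scope, key, application ID, tenant ID, and exact mount point below are placeholders for my real values:

# Approximate mount setup (placeholders for the real scope/key/IDs)
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope>", key="<service-principal-secret>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}
dbutils.fs.mount(
    source="abfss://container@storage_name.dfs.core.windows.net/",
    mount_point="/mnt/storage/container",
    extra_configs=configs,
)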
When I try to use it as a cluster-scoped init script, I get this error:
Cluster scoped init script dbfs:/mnt/storage/container/script.sh failed: Timed out with exception after 5 attempts (debugStr = 'Reading remote file for init script'), Caused by: java.io.FileNotFoundException: /WORKSPACE_ID/mnt/storage/container/script.sh: No such file or directory.
1) I can see this file in DBFS using the %sh magic command in a notebook.
2) I can read from this path using spark.read...
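Both checks look roughly like this (the spark.read call is just an illustrative read of the same path):

# check 1: in a %sh cell, the file is visible through the /dbfs FUSE mount
# %sh
# ls -l /dbfs/mnt/storage/container/script.sh

# check 2: Spark can read from the mounted path without errors
spark.read.text("dbfs:/mnt/storage/container/script.sh").show(truncate=False)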
In the docs I found this:
https://docs.databricks.com/dbfs/unity-catalog.html#use-dbfs-while-launching-unity-catalog-clusters-...
"Databricks recommends using DBFS mounts for init scripts, configurations, and libraries stored in external storage. This behavior is not supported in shared access mode."
When I try to access this file using an abfss:// path instead, I get this error:
Failure to initialize configuration for storage account storage_name.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.key, Caused by: Invalid configuration value detected for fs.azure.account.key.)
But I used the same credentials there as for the mount.
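For reference, the init script destination I entered on the cluster for the abfss attempt looks like this (container and account name match the mount above):

abfss://container@storage_name.dfs.core.windows.net/script.sh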
Do init scripts have any limitations with mounted DBFS paths?
I'm also concerned about the workspace ID that gets prepended to the path in the error message.
I'm using exactly the same path that I get from this command:
dbutils.fs.ls("/mnt/storage/container/script.sh")
I assume that when the init script is executed, the cluster is not yet fully up, so the mount is not available and I cannot reach ADLS through it. So I should use abfss:// instead.
But how do I authenticate to this storage? I tried the approach described here:
https://learn.microsoft.com/en-us/azure/databricks/storage/azure-storage#--access-azure-data-lake-st...
using a service principal in the Spark config, but it doesn't work.
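Roughly, this is what I put in the cluster's Spark config (Advanced options > Spark config) following that page; the account name, IDs, and the secret reference are placeholders for my real values:

fs.azure.account.auth.type.storage_name.dfs.core.windows.net OAuth
fs.azure.account.oauth.provider.type.storage_name.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id.storage_name.dfs.core.windows.net <application-id>
fs.azure.account.oauth2.client.secret.storage_name.dfs.core.windows.net {{secrets/<scope>/<service-principal-secret>}}
fs.azure.account.oauth2.client.endpoint.storage_name.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token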
Does this storage need to be public?