07-11-2023 04:44 AM
Hello,
After implementing a secret scope to store secrets in an Azure Key Vault, I ran into a problem.
When writing output to the blob, I get the following error:
shaded.databricks.org.apache.hadoop.fs.azure.AzureException: Unable to access container analysis in account [REDACTED].blob.core.windows.net using anonymous credentials, and no credentials found for them in the configuration.
After some investigation, it turned out to be related to the following config previously set in the advanced options of the cluster configuration:
"spark.hadoop.fs.azure.account.key.y.blob.core.windows.net", "myStorageAccountKey"
I would like to find a way to set this at the notebook level, after retrieving the secret from the secret scope:
spark.conf.set("spark.hadoop.fs.azure.account.key.y.blob.core.windows.net", "myStorageAccountKey")
Unfortunately, this does not work.
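For reference, the full attempt looked roughly like this; the scope and secret names are placeholders I made up:
# Hypothetical scope/secret names for illustration
storage_key = dbutils.secrets.get(scope="kv-backed-scope", key="storage-account-key")
# Setting a "spark.hadoop.*" property on the session config does not propagate
# to the Hadoop configuration of an already-running cluster, which is likely
# why the write below still fails with the anonymous-credentials error.
spark.conf.set("spark.hadoop.fs.azure.account.key.y.blob.core.windows.net", storage_key)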
Below is an example of how I write the output:
df.write.format("com.crealytics.spark.excel") \
    .option("dataAddress", "'%s'!A1" % (sheetName)) \
    .option("header", "true") \
    .option("dateFormat", "yy-mm-d") \
    .option("timestampFormat", "mm-dd-yyyy hh:mm:ss") \
    .option("useHeader", "true") \
    .mode("append") \
    .save("%s/%s" % (output_blob_folder, outputName))
07-12-2023 01:39 AM
Hi all, thank you for the suggestions.
Doing this:
spark.conf.set("fs.azure.account.key.{storage_account}.dfs.core.windows.net", "{myStorageAccountKey}")
for the Hadoop configuration does not work.
And the suggestion from @Tharun-Kumar amounts to hardcoding secrets in the configuration, which is a big no.
Someone else on Stack Overflow suggested the proper solution, which is to add the following in the same location @Tharun-Kumar pointed to, but referencing the secret scope instead:
spark.hadoop.fs.azure.account.key.<account_name>.blob.core.windows.net {{secrets/<secret-scope-name>/<secret-name>}}
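For completeness, a notebook-level variant that avoids a cluster restart is to write the key into the underlying Hadoop configuration directly. This is a sketch relying on PySpark's internal _jsc handle, with placeholder names:
storage_key = dbutils.secrets.get(scope="<secret-scope-name>", key="<secret-name>")
# Set the key on the Hadoop configuration itself, bypassing the "spark.hadoop."
# prefix that is only copied over when the cluster starts.
spark._jsc.hadoopConfiguration().set(
    "fs.azure.account.key.<account_name>.blob.core.windows.net", storage_key
)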
07-11-2023 10:23 AM
You need to edit the Spark Config by entering the connection information for your Azure Storage account.
Enter the following:
spark.hadoop.fs.azure.account.key.<STORAGE_ACCOUNT_NAME>.blob.core.windows.net <ACCESS_KEY>
where <STORAGE_ACCOUNT_NAME> is your Azure Storage account name, and <ACCESS_KEY> is your storage access key.
Include this in your Spark config and restart the cluster to resolve the issue.
07-12-2023 01:35 AM
Hello, unfortunately this is not the desired solution, as it involves hardcoding the secret in the cluster configuration. I posted the question on Stack Overflow (https://stackoverflow.com/questions/76655569/databricks-not-allowing-to-write-output-to-folder-using...) and got the desired answer.
In the cluster's Spark configuration I wrote the following (a cluster restart is needed for it to take effect):
spark.hadoop.fs.azure.account.key.<account_name>.blob.core.windows.net {{secrets/<secret-scope-name>/<secret-name>}}
07-11-2023 02:20 PM
Please refer to the doc Connect to Azure Data Lake Storage Gen2 and Blob Storage - Azure Databricks | Microsoft Learn. It has the commands you can use to set the Spark config at the notebook level.
service_credential = dbutils.secrets.get(scope="<secret-scope>",key="<service-credential-key>")
spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net", "<application-id>")
spark.conf.set("fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net", service_credential)
spark.conf.set("fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net", "https://login.microsoftonline.com/<directory-id>/oauth2/token")
Replace <secret-scope>, <service-credential-key>, <storage-account>, <application-id>, and <directory-id> with your own values.
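Once these are set, reads and writes go through an abfss:// URI. A minimal sketch with placeholder names:
# Placeholder container and path for illustration
path = "abfss://<container>@<storage-account>.dfs.core.windows.net/analysis/"
dbutils.fs.ls(path)                        # quick sanity check that auth works
df.write.mode("append").parquet(path + "output")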
07-11-2023 08:06 PM
@Prabakar you are using a service principal here.
07-11-2023 06:46 PM - edited 07-11-2023 07:17 PM
Hello @TheoDeSo, simply rewrite the configuration:
spark.conf.set("fs.azure.account.key.{storage_account}.dfs.core.windows.net", "{myStorageAccountKey}")
Use this URI to access the storage account: abfss://{container_name}@{storage_account}.dfs.core.windows.net/
You can check it with: dbutils.fs.ls("abfss://{container_name}@{storage_account}.dfs.core.windows.net/")
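Writing then follows the same pattern; for example, the original Excel write would become something like this (paths are placeholders):
output_blob_folder = "abfss://{container_name}@{storage_account}.dfs.core.windows.net/reports"
df.write.format("com.crealytics.spark.excel") \
    .option("header", "true") \
    .mode("append") \
    .save("%s/%s" % (output_blob_folder, outputName))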
07-12-2023 12:40 AM
Hi @TheoDeSo
Thank you for posting your question in our community! We are happy to assist you.
To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?
This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!
a week ago - last edited a week ago
Hi all,
Is it correct that Azure Databricks only supports writing data to Azure Data Lake Storage Gen2, and does not support Azure Blob Storage (StorageV2, general purpose)?
In my case, I can read data from Azure Blob Storage (StorageV2, general purpose v2) into Databricks, but writing data back from Databricks to that blob raises an error.
Here is my code
Any ideas? Please help!
Thanks
Mo