
Credential Sharing Across Cluster Nodes - spark.conf()

turagittech
Contributor

Hi All,

I am struggling to understand how to manage credentials for Azure Storage across the cluster when using the Azure Python libraries inside functions that may end up running on the worker nodes.

I am building a task to load blobs from an Azure storage account that doesn't have HNS enabled, which means I can't use DLT/Lakeflow to load the data. That sucks. I have configured spark.conf settings in the cluster config, and the logs suggest the settings are being applied (no failures), however I am getting this:

Failure to initialize configuration for azure blob store azure.blob.storage
Invalid configuration value detected for fs.azure.account.key

The error is triggered by a spark.read.json(filename) call when the file is read so its contents can be written to a table.
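For reference, the pattern I'm trying to reproduce is the documented ABFS OAuth (service principal) setup. A minimal sketch of my understanding of it, with placeholder account/tenant/container values, targeting the dfs.core.windows.net endpoint the docs use:

# Sketch only: standard ABFS OAuth service-principal session config, placeholder values.
storage_account = "mystorageaccount"  # placeholder
directory_id = "my-tenant-guid"  # placeholder
application_id = "my-app-client-id"  # placeholder
service_credential = dbutils.secrets.get(scope="Secrets", key="secret-accesskey")

spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net", application_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net", service_credential)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
    f"https://login.microsoftonline.com/{directory_id}/oauth2/token")

df = spark.read.json(f"abfss://mycontainer@{storage_account}.dfs.core.windows.net/path/to/file.json")

What I actually have configured is below.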

Cluster config

Spark config:

spark.conf.set(oauth_string, "OAuth")
spark.conf.set(fs_oauth_provider_string, "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(fs_oauth_client_id_string, application_id)
spark.conf.set(fs_oauth_client_secret_string, service_credential)
spark.conf.set(fs_oauth_client_endpoint_string, oauth_login)
spark.conf.set("fs.azure.account.key.storageaccount.blob.core.windows.net", dbutils.secrets.get(scope=key_scope, key=service_credential_key))

Environment variables:

PYSPARK_PYTHON=/databricks/python3/bin/python3
DATABRICKS_DEFAULT_SERVICE_CREDENTIAL_NAME=dbricks-connector-id
directory_id=aguid
key_scope=Secrets
service_credential_key=secret-accesskey
oauth_string=fs.azure.account.oauth.provider.type.{storage_account}.blob.core.windows.net
fs_oauth_provider_string=fs.azure.accountoauth.provider.type.{storage_account}.blob.core.windows.net
fs_oauth_client_id_string=fs.azure.account.oauth2.client.id.{storage_account}.blob.core.windows.net
fs_oauth_client_secret_string=fs.azure.account.oauth2.client.secret.{storage_account}.blob.core.windows.net
fs_oauth_client_endpoint_string=fs.azure.account.oauth2.client.endpoint.{storage_account}.blob.core.windows.net
oauth_login=https://login.microsoftonline.com/{directory_id}/oauth2/token

I've changed a few values for security reasons.
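For clarity on how these fit together: the bare names in the spark.conf.set lines above would have to be resolved from these environment variables on the Python side, since the cluster config boxes take literal text and don't expand anything. A rough sketch of that wiring; note the {storage_account} placeholder inside the values stays literal until Python formats it:

import os

# Sketch only: pull a config-key name from the cluster environment variables
# and expand its {storage_account} placeholder before handing it to Spark.
storage_account = "mystorageaccount"  # placeholder
key_name = os.environ["fs_oauth_client_id_string"].format(storage_account=storage_account)
spark.conf.set(key_name, application_id)  # application_id obtained elsewhere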

If I am configuring something wrong, what is it?

If it's not the config, what else could be wrong?

1 REPLY

Sidhant07
Databricks Employee

Hi @turagittech,

I found a KB article related to this error. Let me know if this helps.

https://kb.databricks.com/data-sources/keyproviderexception-error-when-trying-to-create-an-external-...
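If that's the KeyProviderException article I'm thinking of, the gist is to make the credential available to every node by referencing the secret directly in the cluster's Spark config (one key-value pair per line, no Python, using the documented {{secrets/<scope>/<key>}} syntax) rather than calling spark.conf.set in a notebook. A sketch using your names, assuming Secrets/secret-accesskey actually holds the storage account key:

fs.azure.account.key.storageaccount.blob.core.windows.net {{secrets/Secrets/secret-accesskey}}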