Hi All,
I am struggling to understand how to manage credentials for Azure Storage across a cluster when using the Azure Python libraries inside functions that may end up running on the cluster worker nodes.
I am building a task to load blobs from Azure Storage. The storage account doesn't have HNS (hierarchical namespace) enabled, which means I can't use DLT/Lakeflow to load the data. That sucks. I have configured the spark.conf settings in the cluster config, and I can see in the logs that they are apparently being set (no failures). However, I am getting this error:
Failure to initialize configuration for azure blob store azure.blob.storage
Invalid configuration value detected for fs.azure.account.key Invalid configuration value detected for fs.azure.account.key
The error is triggered by a spark.read.json(filename) call, where the file is read so that its contents can be written to a table.
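To make the failing call concrete, here is a minimal sketch of how the host in the path relates to the account-scoped setting names, as far as I understand it. The helper name `account_host` and the example container, account, and path are hypothetical. The idea that the driver looks up `fs.azure.*.<host>` settings by the host portion of the path is my assumption, not something the logs confirm.

```python
from urllib.parse import urlparse


def account_host(path: str) -> str:
    """Return the storage host portion of an abfs(s):// or wasb(s):// path."""
    parsed = urlparse(path)
    if parsed.scheme not in ("abfs", "abfss", "wasb", "wasbs"):
        raise ValueError(f"unexpected scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError(f"no storage host in path: {path!r}")
    return parsed.hostname


# Hypothetical path; the real container and account names differ.
filename = "abfss://mycontainer@storageaccount.dfs.core.windows.net/landing/file.json"
host = account_host(filename)

# Assumption: account-scoped settings are suffixed with this exact host,
# so a key ending in .blob.core.windows.net would not match a .dfs path.
expected_key = f"fs.azure.account.auth.type.{host}"
print(expected_key)
```

If that assumption holds, settings suffixed with `blob.core.windows.net` would not apply to a path that uses the `dfs.core.windows.net` host, and vice versa.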
ClusterConfig
SPARK
spark.conf.set("oauth_string", "OAuth")
spark.conf.set(fs_oauth_provider_string, "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(fs_oauth_client_id_string, application_id)
spark.conf.set(fs_oauth_client_secret_string, service_credential)
spark.conf.set(fs_oauth_client_endpoint_string, oauth_login)
spark.conf.set("fs.azure.account.key.storageaccount.blob.core.windows.net", dbutils.secrets.get(scope=key_scope, key=service_credential_key))
EnvironmentVariables
PYSPARK_PYTHON=/databricks/python3/bin/python3
DATABRICKS_DEFAULT_SERVICE_CREDENTIAL_NAME=dbricks-connector-id
directory_id=aguid
key_scope=Secrets
service_credential_key=secret-accesskey
oauth_string=fs.azure.account.oauth.provider.type.{storage_account}.blob.core.windows.net
fs_oauth_provider_string=fs.azure.account.oauth.provider.type.{storage_account}.blob.core.windows.net
fs_oauth_client_id_string=fs.azure.account.oauth2.client.id.{storage_account}.blob.core.windows.net
fs_oauth_client_secret_string=fs.azure.account.oauth2.client.secret.{storage_account}.blob.core.windows.net
fs_oauth_client_endpoint_string=fs.azure.account.oauth2.client.endpoint.{storage_account}.blob.core.windows.net
oauth_login=https://login.microsoftonline.com/{directory_id}/oauth2/token
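For comparison, this is a minimal sketch of the service principal (OAuth) settings as I understand them from the Azure Databricks documentation for ABFS. The function name `build_oauth_conf`, its parameters, and the default `dfs.core.windows.net` suffix are assumptions for illustration, and the values passed in are placeholders rather than my real ones.

```python
def build_oauth_conf(storage_account, application_id, client_secret,
                     directory_id, suffix="dfs.core.windows.net"):
    """Build the account-scoped OAuth settings as a plain dict (hypothetical helper)."""
    host = f"{storage_account}.{suffix}"
    return {
        # The auth type and the token provider are two separate keys.
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": application_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{directory_id}/oauth2/token",
    }


# Placeholder values only.
conf = build_oauth_conf("storageaccount", "app-id", "not-a-real-secret", "aguid")
for key in conf:
    print(key)  # print the keys only, never the secret value
```

In a notebook, the pairs would then be applied with `for key, value in conf.items(): spark.conf.set(key, value)`, with the secret coming from `dbutils.secrets.get`.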
I've changed a few values for security reasons.
Am I configuring something wrong? If so, what?
If it's not the config, what else could be wrong?