
ADLS gen2 config issue

mr_poola49
New Contributor III

I am new to Azure Databricks and am trying to access ADLS Gen2 from a notebook. I have set all the required configurations in the notebook, but when I query the table using spark.sql(), it throws the exception "Failure to initialize configuration for storage account". I authenticate to the storage account with OAuth, using a client ID and secret. The strange thing is that I set the Spark configs for prodadlsg2, but the error points to a different storage account, ppeadlsg2. I am not sure how to correct this. I want to connect to the prod storage account instead of ppeadlsg2, and I also want to remove the configs related to the PPE storage account. Please assist. I am using DBR 15.4 LTS. The actual storage account names differ from those in the code snippet because the data is sensitive.

 

 

spark.conf.set("fs.azure.account.auth.type.prodadlsg2.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.prodadlsg2.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.prodadlsg2.dfs.core.windows.net", "<application_id>")
spark.conf.set("fs.azure.account.oauth2.client.secret.prodadlsg2.dfs.core.windows.net", "<application_secret>")
spark.conf.set("fs.azure.account.oauth2.client.endpoint.prodadlsg2.dfs.core.windows.net", s"https://login.microsoftonline.com/<tenantId>/oauth2/token")

val sqlContext = spark.sqlContext
val tableNamesInDb = sqlContext.tableNames("default").toSet

val tableName = "some_table"
val tableExists = tableNamesInDb.contains(tableName)

if (!tableExists) {
  spark.sql(s"CREATE TABLE $tableName USING DELTA LOCATION 'abfss://$StorageContainer@prodadlsg2.dfs.core.windows.net/$standardizationRootPath/$entity/$version/standard'")
}

val countDf = spark.sql(s"SELECT COUNT(*) AS record_count FROM $tableName")
display(countDf)

KeyProviderException: Failure to initialize configuration for storage account ppeadlsg2.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.key
Caused by: InvalidConfigurationValueException: Invalid configuration value detected for fs.azure.account.key

3 REPLIES

szymon_dybczak
Contributor III

Hi @mr_poola49,

Did you assign the proper role to the service principal on this storage account? Also, could you try the following test:

- try to set the secrets in the cluster configuration, not in the notebook (as in the screenshot below)

[Screenshot: szymon_dybczak_0-1728323695006.png]

I know it sounds weird, but I had a similar problem in the past even though spark.conf was set correctly in the notebook.
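
For readers without the screenshot, a cluster-level equivalent of the notebook settings could look roughly like the fragment below, entered in the cluster's Spark config field. This is a sketch, not a copy of the screenshot: it assumes the secret is stored in a Databricks secret scope (the scope and key names are placeholders), and it uses the `spark.hadoop.` prefix that cluster-level Hadoop settings need.

```
spark.hadoop.fs.azure.account.auth.type.prodadlsg2.dfs.core.windows.net OAuth
spark.hadoop.fs.azure.account.oauth.provider.type.prodadlsg2.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
spark.hadoop.fs.azure.account.oauth2.client.id.prodadlsg2.dfs.core.windows.net <application_id>
spark.hadoop.fs.azure.account.oauth2.client.secret.prodadlsg2.dfs.core.windows.net {{secrets/<scope-name>/<secret-key>}}
spark.hadoop.fs.azure.account.oauth2.client.endpoint.prodadlsg2.dfs.core.windows.net https://login.microsoftonline.com/<tenantId>/oauth2/token
```

Each line is a key and a value separated by a space. The cluster has to be restarted for the change to take effect.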

Hi @szymon_dybczak,

I tried the settings above, but I get the same error.

mr_poola49
New Contributor III

Issue is resolved! I dropped the table from hive_metastore, which was pointing to the ppeadlsg2 storage container, and re-created it using the prodadlsg2 storage.
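
The resolution fits the code in the question: the table was already registered in hive_metastore with a location on ppeadlsg2, so the `if (!tableExists)` branch was skipped and the query went to the old account, for which no credentials were configured. A minimal Scala sketch of how one might detect that situation before querying is shown here. The helper names `storageAccountOf` and `pointsAt` are hypothetical, and the notebook part is left as comments because it assumes a Databricks session; whether `DESCRIBE TABLE EXTENDED` succeeds without access to the old account has not been verified.

```scala
import java.net.URI
import scala.util.Try

// Hypothetical helper (not from the thread): extracts the storage account
// name from an abfss:// location such as
// "abfss://container@account.dfs.core.windows.net/some/path".
def storageAccountOf(location: String): Option[String] =
  Try(new URI(location)).toOption
    .flatMap(uri => Option(uri.getHost)) // e.g. "account.dfs.core.windows.net"
    .map(_.takeWhile(_ != '.'))          // keep the part before the first dot

// Hypothetical helper: true when the location is on the expected account.
def pointsAt(location: String, expectedAccount: String): Boolean =
  storageAccountOf(location).contains(expectedAccount)

// Notebook usage, assuming `spark`, `tableName` and the path variables from
// the original post are in scope:
//
//   val location = spark.sql(s"DESCRIBE TABLE EXTENDED $tableName")
//     .filter("col_name = 'Location'")
//     .select("data_type")
//     .head.getString(0)
//
//   if (!pointsAt(location, "prodadlsg2")) {
//     // The table was created with LOCATION, so dropping it removes only
//     // the metastore entry, not the underlying files.
//     spark.sql(s"DROP TABLE IF EXISTS $tableName")
//     spark.sql(s"CREATE TABLE $tableName USING DELTA LOCATION 'abfss://$StorageContainer@prodadlsg2.dfs.core.windows.net/$standardizationRootPath/$entity/$version/standard'")
//   }
```

Checking the registered location rather than only the table's existence keeps the notebook from silently reusing a table definition left over from another environment.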
