cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Invalid configuration fs.azure.account.key trying to load ML Model with OAuth

CarstenWeber
New Contributor II

Hi Community,
i was trying to load a ML Model from a Azure Storageaccount (abfss://....) with:

 

model = PipelineModel.load(path)

 

i set the spark config: 

 

spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",)
spark.conf.set("fs.azure.account.oauth2.client.id", client_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", client_secret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint","https://login.microsoftonline.com/<tenant_id>/oauth2/token")

 

and i always get the following error:

 

Py4JJavaError: An error occurred while calling o772.partitions.
: Failure to initialize configuration for storage account <storage>.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:52)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:682)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:2076)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:268)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:235)
	at com.databricks.common.filesystem.LokiABFS.initialize(LokiABFS.scala:36)
	at com.databricks.common.filesystem.LokiFileSystem$.$anonfun$getLokiFS$1(LokiFileSystem.scala:154)
	at com.databricks.common.filesystem.FileSystemCache.getOrCompute(FileSystemCache.scala:46)
	at com.databricks.common.filesystem.LokiFileSystem$.getLokiFS(LokiFileSystem.scala:151)
	at com.databricks.common.filesystem.LokiFileSystem.initialize(LokiFileSystem.scala:209)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3611)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:554)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
	at com.databricks.unity.SAM.createDelegate(SAM.scala:215)
	at com.databricks.unity.SAM.createDelegate$(SAM.scala:208)
	at com.databricks.unity.ClusterDefaultSAM$.createDelegate(SAM.scala:250)
	at com.databricks.sql.acl.fs.CredentialScopeFileSystem.createDelegate(CredentialScopeFileSystem.scala:85)
	at com.databricks.sql.acl.fs.CredentialScopeFileSystem.$anonfun$setDelegates$2(CredentialScopeFileSystem.scala:151)
	at com.databricks.sql.acl.fs.Lazy.apply(DelegatingFileSystem.scala:310)
	at com.databricks.sql.acl.fs.CredentialScopeFileSystem.globStatus(CredentialScopeFileSystem.scala:242)
	at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:276)
	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:244)
	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:245)
	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:336)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:332)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:57)
	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:336)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:332)
	at org.apache.spark.api.java.JavaRDDLike.partitions(JavaRDDLike.scala:63)
	at org.apache.spark.api.java.JavaRDDLike.partitions$(JavaRDDLike.scala:63)
	at org.apache.spark.api.java.AbstractJavaRDDLike.partitions(JavaRDDLike.scala:46)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
	at py4j.Gateway.invoke(Gateway.java:306)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
	at java.lang.Thread.run(Thread.java:750)
Caused by: Invalid configuration value detected for fs.azure.account.key
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.ConfigurationBasicValidator.validate(ConfigurationBasicValidator.java:49)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.Base64StringConfigurationBasicValidator.validate(Base64StringConfigurationBasicValidator.java:40)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.validateStorageAccountKey(SimpleKeyProvider.java:71)
	at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:49)
	... 45 more
File <command-2091557935329574>, line 1
----> 1 model = something.load(abfss_url)
File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325 if answer[1] == REFERENCE_TYPE:
--> 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
    330     raise Py4JError(
    331         "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332         format(target_id, ".", name, value))

 

i tested it with mounting the blob-container first and load the model from the local dbfs path. and that works.

so why is PipelineModel.load() ignoring the oauth settings compeltely?

1 ACCEPTED SOLUTION

Accepted Solutions

@CarstenWeber 
There's one thing worth trying - I had similiar issue when using Autoloader, even though spark.conf was set correctly, Autoloader was throwing the same error.
What I had to do is to set the secrets in the cluster configuration, not in the "notebook".

daniel_sahal_0-1713848118630.png

 

View solution in original post

4 REPLIES 4

daniel_sahal
Esteemed Contributor

CarstenWeber
New Contributor II

@daniel_sahal 
i already tried it out with the "longer" version for spark configs as mentioned in the article.

Tbh. for regular spark.read.load(path) commands both versions work just fine. I guess the one i used is a general conf. and the one in the article is fine-tuned to the exact ADLS. so you could access different ADLS endpoints with different credentials.

but the error still persists. so either spark conf results in the  "Invalid configuration value detected for fs.azure.account.key" error

@CarstenWeber 
There's one thing worth trying - I had similiar issue when using Autoloader, even though spark.conf was set correctly, Autoloader was throwing the same error.
What I had to do is to set the secrets in the cluster configuration, not in the "notebook".

daniel_sahal_0-1713848118630.png

 

CarstenWeber
New Contributor II

@daniel_sahal using the settings above did indeed work. 

Join 100K+ Data Experts: Register Now & Grow with Us!

Excited to expand your horizons with us? Click here to Register and begin your journey to success!

Already a member? Login and join your local regional user group! If there isn’t one near you, fill out this form and we’ll create one for you to join!