04-22-2024 01:10 AM - edited 04-22-2024 01:11 AM
Hi Community,
I was trying to load an ML model from an Azure Storage Account (abfss://....) with:
model = PipelineModel.load(path)
I set the Spark config:
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",)
spark.conf.set("fs.azure.account.oauth2.client.id", client_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", client_secret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint","https://login.microsoftonline.com/<tenant_id>/oauth2/token")
and I always get the following error:
Py4JJavaError: An error occurred while calling o772.partitions.
: Failure to initialize configuration for storage account <storage>.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:52)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:682)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:2076)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:268)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:235)
at com.databricks.common.filesystem.LokiABFS.initialize(LokiABFS.scala:36)
at com.databricks.common.filesystem.LokiFileSystem$.$anonfun$getLokiFS$1(LokiFileSystem.scala:154)
at com.databricks.common.filesystem.FileSystemCache.getOrCompute(FileSystemCache.scala:46)
at com.databricks.common.filesystem.LokiFileSystem$.getLokiFS(LokiFileSystem.scala:151)
at com.databricks.common.filesystem.LokiFileSystem.initialize(LokiFileSystem.scala:209)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3611)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:554)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
at com.databricks.unity.SAM.createDelegate(SAM.scala:215)
at com.databricks.unity.SAM.createDelegate$(SAM.scala:208)
at com.databricks.unity.ClusterDefaultSAM$.createDelegate(SAM.scala:250)
at com.databricks.sql.acl.fs.CredentialScopeFileSystem.createDelegate(CredentialScopeFileSystem.scala:85)
at com.databricks.sql.acl.fs.CredentialScopeFileSystem.$anonfun$setDelegates$2(CredentialScopeFileSystem.scala:151)
at com.databricks.sql.acl.fs.Lazy.apply(DelegatingFileSystem.scala:310)
at com.databricks.sql.acl.fs.CredentialScopeFileSystem.globStatus(CredentialScopeFileSystem.scala:242)
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:276)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:244)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:245)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:336)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:332)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:57)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:336)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:332)
at org.apache.spark.api.java.JavaRDDLike.partitions(JavaRDDLike.scala:63)
at org.apache.spark.api.java.JavaRDDLike.partitions$(JavaRDDLike.scala:63)
at org.apache.spark.api.java.AbstractJavaRDDLike.partitions(JavaRDDLike.scala:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
at py4j.Gateway.invoke(Gateway.java:306)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
at java.lang.Thread.run(Thread.java:750)
Caused by: Invalid configuration value detected for fs.azure.account.key
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.ConfigurationBasicValidator.validate(ConfigurationBasicValidator.java:49)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.Base64StringConfigurationBasicValidator.validate(Base64StringConfigurationBasicValidator.java:40)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.validateStorageAccountKey(SimpleKeyProvider.java:71)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:49)
... 45 more
File <command-2091557935329574>, line 1
----> 1 model = something.load(abfss_url)
File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
325 if answer[1] == REFERENCE_TYPE:
--> 326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(
331 "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
332 format(target_id, ".", name, value))
I tested it by mounting the blob container first and loading the model from the local DBFS path, and that works.
So why is PipelineModel.load() ignoring the OAuth settings completely?
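For reference, the mount-based workaround that does work looks roughly like this (just a sketch; the container name, mount point and model path are placeholders):

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": client_id,
    "fs.azure.account.oauth2.client.secret": client_secret,
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant_id>/oauth2/token",
}
# mount the ADLS container with the same OAuth settings, then load via the mount point
dbutils.fs.mount(source="abfss://<container>@<storage>.dfs.core.windows.net/", mount_point="/mnt/models", extra_configs=configs)
model = PipelineModel.load("/mnt/models/<path-to-model>")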
04-22-2024 04:35 AM
@CarstenWeber
Your spark.conf settings should be a little different.
See documentation here: https://learn.microsoft.com/en-us/azure/databricks/connect/storage/azure-storage#azureserviceprincip...
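Roughly, the account-scoped variant from that doc looks like this (sketch only; <storage-account> and the secret scope/key names are placeholders):

storage_account = "<storage-account>"
# scope the OAuth settings to one specific storage account instead of setting them globally
spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net", dbutils.secrets.get(scope="<scope>", key="<client-id-key>"))
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net", dbutils.secrets.get(scope="<scope>", key="<client-secret-key>"))
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net", "https://login.microsoftonline.com/<tenant_id>/oauth2/token")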
04-22-2024 05:12 AM
@daniel_sahal
I already tried the "longer" version of the Spark configs mentioned in the article.
Tbh, for regular spark.read.load(path) commands both versions work just fine. I guess the one I used is a general config, while the one in the article is scoped to the exact ADLS account, so you could access different ADLS endpoints with different credentials.
But the error still persists: with either Spark conf I get the "Invalid configuration value detected for fs.azure.account.key" error.
04-22-2024 09:55 PM
@CarstenWeber
There's one thing worth trying - I had a similar issue with Auto Loader: even though spark.conf was set correctly, Auto Loader was throwing the same error.
What I had to do was set the secrets in the cluster configuration, not in the notebook.
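As a rough sketch of what that means in practice (the storage account, secret scope and key names are placeholders; adjust to your setup), you put entries like these into the cluster's Spark config under Advanced options instead of calling spark.conf.set() in the notebook:

spark.hadoop.fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net OAuth
spark.hadoop.fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
spark.hadoop.fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net {{secrets/<scope>/<client-id-key>}}
spark.hadoop.fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net {{secrets/<scope>/<client-secret-key>}}
spark.hadoop.fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net https://login.microsoftonline.com/<tenant_id>/oauth2/token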
04-23-2024 12:36 AM
@daniel_sahal Using the settings above (in the cluster configuration) did indeed work.
06-26-2024 11:22 PM
@daniel_sahal I am facing the same error, but I have a multi-tenant application, i.e. if I set the cluster-level config and multiple clients are operating on that cluster, I can run into a race condition. Is there a way to get this working without putting the credentials in the cluster configuration?
06-26-2024 11:28 PM
@chhavibansal
Having storage keys set up in an init script or even in the notebook is obsolete. I would suggest switching to Unity Catalog and setting up volumes/external locations.
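With Unity Catalog, the model path would then be a volume path (or an abfss:// URI covered by an external location) instead of manually configured credentials. Roughly, with placeholder catalog/schema/volume names:

from pyspark.ml import PipelineModel
# load from a Unity Catalog volume path
model = PipelineModel.load("/Volumes/<catalog>/<schema>/<volume>/path/to/model")
# or from an abfss:// path governed by a UC external location
model = PipelineModel.load("abfss://<container>@<storage>.dfs.core.windows.net/path/to/model")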
06-26-2024 11:44 PM
@daniel_sahal What if Unity Catalog is not an option for my service (a missing integration, let's say)? How do I solve it in that case?
06-27-2024 12:15 AM
@chhavibansal Unfortunately, I don't see any other way.
06-27-2024 07:48 AM
@daniel_sahal Any possible reason you know of why this works in OSS Spark but not in a Databricks notebook? Why is there a disparity?