02-27-2025 03:16 PM - edited 02-27-2025 03:23 PM
I have my own Autoloader repo, which is responsible for ingesting data from the landing layer (ADLS) and loading it into the raw layer in Databricks. In that repo I created a couple of workflows that run on serverless compute, and I use a .whl Python package as a dependent library in my tasks.
I have an NCC (Network Connectivity Configuration) in place but I am still getting an error, because I set a couple of Spark configurations in this repo.
I set the following configuration in a .py file:
def set_storage_account_config(
    storage_account: str,
    secret_scope: str,
    spn_tenant_id_key: str,
    spn_client_id_key: str,
    spn_client_secret_key: str,
) -> None:
    """
    Fetch the SPN credentials from Key Vault using the provided key names and
    scope, and configure Spark to use this SPN when connecting to the given
    ADLS Gen2 storage account.
    """
    logger.info(f"Setting spark config for storage account '{storage_account}'")
    spn_tenant_id = dbutils.secrets.get(scope=secret_scope, key=spn_tenant_id_key)
    spn_client_id = dbutils.secrets.get(scope=secret_scope, key=spn_client_id_key)
    spn_client_secret = dbutils.secrets.get(scope=secret_scope, key=spn_client_secret_key)
    spark.conf.set(
        f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth"
    )
    spark.conf.set(
        f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    )
    spark.conf.set(
        f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net",
        spn_client_id,
    )
    spark.conf.set(
        f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net",
        spn_client_secret,
    )
    spark.conf.set(
        f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
        f"https://login.microsoftonline.com/{spn_tenant_id}/oauth2/token",
    )
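For context, the function is called once per storage account before any reads; the scope and key names below are placeholders, not the real values:

# Hypothetical example call; replace the scope and key names with your own
set_storage_account_config(
    storage_account="adlsxxxxxx",
    secret_scope="my-keyvault-scope",
    spn_tenant_id_key="spn-tenant-id",
    spn_client_id_key="spn-client-id",
    spn_client_secret_key="spn-client-secret",
)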
and this Delta table properties configuration in another .py file:
def set_delta_table_properties(delta_table_properties: dict) -> None:
    """
    Take a dictionary of Delta table properties and set each of them as Spark
    session defaults. For a complete list check this:
    https://docs.databricks.com/en/delta/table-properties.html. This function
    should be called only once, before loading any of the sources.
    """
    logger.info("Setting spark session delta table properties")
    logger.debug(f"Using this config '{delta_table_properties}'")
    # Set the properties for the spark session
    for k, v in delta_table_properties.items():
        logger.debug(f"Setting 'spark.databricks.delta.properties.defaults.{k}' to '{v}'")
        spark.sql(f"set spark.databricks.delta.properties.defaults.{k} = {v}")
However, when I run the workflow on serverless compute, I get the following error:
Error:
[CONFIG_NOT_AVAILABLE] Configuration fs.azure.account.auth.type.adlsxxxxxx.dfs.core.windows.net is not available. SQLSTATE: 42K0I
How can I access files stored in ADLS with serverless?
Thank you.
Accepted Solutions
a month ago
The recommended approach for accessing cloud storage is to create Databricks storage credentials. These storage credentials can refer to Microsoft Entra service principals, managed identities, etc. After a credential is created, create an external location on top of it. Once that is done, you will be able to access the ADLS location without any additional Spark configuration.
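As a minimal sketch, assuming a storage credential has already been created (for example in Catalog Explorer, backed by an Azure managed identity / access connector), and with the credential name, container, storage account, and principal below as placeholders:

# Minimal sketch: names and paths are placeholders; the storage credential
# `adls_mi_credential` is assumed to already exist in Unity Catalog.
spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS landing_adls
    URL 'abfss://landing@adlsxxxxxx.dfs.core.windows.net/'
    WITH (STORAGE CREDENTIAL adls_mi_credential)
""")

# Grant the principal that runs the workflow access to read the location
spark.sql("GRANT READ FILES ON EXTERNAL LOCATION landing_adls TO `my_workflow_principal`")

# After that, serverless compute can read the path directly; no
# fs.azure.account.* spark.conf settings are needed, so the
# CONFIG_NOT_AVAILABLE error goes away.
df = spark.read.format("json").load("abfss://landing@adlsxxxxxx.dfs.core.windows.net/some/path")

With the external location in place, the abfss:// path resolves through Unity Catalog, so the set_storage_account_config step is no longer needed for serverless workflows.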