ABFS Authentication with a SAS token -> 403!
12-03-2024 01:06 PM
Hi guys,
I'm running a streaming read/write with Auto Loader from a StorageV2 (general purpose v2) account over abfss instead of wasbs. My checkpoint location is valid, the reader properly picks up the file schema, and Auto Loader is able to sample 105 files to infer it.
I have a valid SAS token with all permissions set, and the storage account is not behind a firewall; it is open to access from all networks. However, whenever I try to access the storage location with abfss, I get the following error:
(shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsRestOperationException) Operation failed: "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.", 403
The SAS token is set like this:
spark.conf.set("fs.azure.account.auth.type.<storage_account>.dfs.core.windows.net", "SAS")
spark.conf.set("fs.azure.sas.token.provider.type.<storage_account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider")
spark.conf.set("fs.azure.sas.fixed.token.<storage_account>.dfs.core.windows.net", <sas_token>)
I am currently running 15.4 LTS (includes Apache Spark 3.5.0, Scala 2.12) with Azure Data Lake Storage credential passthrough enabled. Blob soft delete is disabled on the storage account, and the SAS token has all the possible permissions.
The same operation and setup work with wasbs, which leaves me wondering what the possible causes are and how to fix them. If anyone has encountered this issue or knows how to solve it without using an Azure Service Principal, I would appreciate the help. I've spent way too much time on this with no real solution.
- Labels: Spark
12-04-2024 03:14 AM
Resolved it with Service Principal.
12-08-2024 11:37 PM
Would you mind pasting the sample code, please? I am trying to use abfss with Auto Loader and getting an error like yours.
12-11-2024 01:06 AM
Hi BricksGuy,
So I created a service principal in the Azure portal for my user, which gives you a client ID and a secret. You also need the tenant ID.
Then you can set your Spark options as below:
spark.conf.set(f"fs.azure.account.auth.type.{storage_account_name}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account_name}.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account_name}.dfs.core.windows.net", <sp_client_id>)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account_name}.dfs.core.windows.net", "<sp_secret>")
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account_name}.dfs.core.windows.net", "https://login.microsoftonline.com/<tenant_id>/oauth2/token")
Make sure to use dfs.core.windows.net and not blob.core.windows.net in the config keys; otherwise Spark gets confused and you'll hit a similar problem, with either the method not being allowed or the headers not being formed correctly.
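As a side note, rather than hard-coding the client secret in the notebook, you can pull it from a Databricks secret scope. A small sketch, assuming a scope and key that you create yourself (the names here are placeholders):
# "<scope_name>" and "<sp_secret_key>" are placeholder names for a secret scope you set up
sp_secret = dbutils.secrets.get(scope="<scope_name>", key="<sp_secret_key>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account_name}.dfs.core.windows.net", sp_secret)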
Once these configs have been set, you can access your storage. To verify, I just list the directories as below:
directories = dbutils.fs.ls(f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/{main_path}")
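And since you asked about Auto Loader specifically: with the configs above in place, a minimal sketch of the streaming read/write looks roughly like this. The container, paths, source file format, and the schema/checkpoint locations are placeholders you'd adapt to your setup:
# Read new files incrementally with Auto Loader from the abfss location
df = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")  # whatever format your source files are in
    .option("cloudFiles.schemaLocation", f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/_schemas/my_stream")
    .load(f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/{main_path}"))

# Write out as Delta, processing everything currently available and then stopping
query = (df.writeStream
    .format("delta")
    .option("checkpointLocation", f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/_checkpoints/my_stream")
    .trigger(availableNow=True)
    .start(f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/{output_path}"))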
It took me a couple of days to get from a standstill to here. I'm on the 14.3 LTS runtime; I found most online resources worked better with that runtime version.
Good luck and let me know if I can help you further.
Cheers

