12-11-2024 01:06 AM
Hi BricksGuy,
I created a service principal in the Azure portal for my user, which gives you a client ID and a client secret. You also need the tenant ID.
Then you can set your spark options as below:
spark.conf.set(f"fs.azure.account.auth.type.{storage_account_name}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account_name}.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account_name}.dfs.core.windows.net", "<sp_client_id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account_name}.dfs.core.windows.net", "<sp_secret>")
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account_name}.dfs.core.windows.net", "https://login.microsoftonline.com/<tenant_id>/oauth2/token")
Make sure to use the DFS endpoint (dfs.core.windows.net) and not the Blob endpoint in the config keys. Otherwise Spark gets confused and you'll hit a similar problem, with either the method not being allowed or the headers not being set correctly.
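If it helps, the same settings can be wrapped in a small helper so the storage account name only appears once and the DFS host can't be mistyped. This is just a sketch: `build_oauth_conf` and `apply_conf` are names I made up, and the secret scope and key in the commented usage are hypothetical. Reading the secret with `dbutils.secrets.get` instead of pasting it into the notebook is my assumption about how you'd want to handle it.

```python
def build_oauth_conf(storage_account_name, client_id, client_secret, tenant_id):
    """Return the Spark config entries for service-principal OAuth on ADLS Gen2."""
    # Every key is scoped to the DFS host, not blob.core.windows.net.
    host = f"{storage_account_name}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


def apply_conf(spark, conf):
    """Set each entry on the given SparkSession."""
    for key, value in conf.items():
        spark.conf.set(key, value)


# In a Databricks notebook, where `spark` and `dbutils` are predefined:
# secret = dbutils.secrets.get(scope="my-scope", key="sp-secret")  # hypothetical scope/key
# apply_conf(spark, build_oauth_conf("<storage_account_name>", "<sp_client_id>", secret, "<tenant_id>"))
```

Keeping the dictionary separate from the `spark.conf.set` loop also lets you print the keys and check them before applying anything.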
Once this has executed, you can access your storage. To verify, I just list the dirs as below:
directories = dbutils.fs.ls(f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/{main_path}")
It took me a couple of days to get from a standstill to here. I'm using the 14.3 runtime, and I found that most online resources work better with that runtime version.
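For the verification step, a tiny helper for building the abfss path can save some typos. Again only a sketch: `abfss_uri` is a hypothetical name, and stripping a leading slash from the path is my own choice so that you don't end up with a double slash.

```python
def abfss_uri(container_name, storage_account_name, path=""):
    """Build an abfss:// URI that points at the DFS endpoint."""
    # Drop any leading slash so the URI never contains "net//".
    clean_path = path.lstrip("/")
    return f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/{clean_path}"


# In a Databricks notebook, where `dbutils` is predefined:
# for entry in dbutils.fs.ls(abfss_uri(container_name, storage_account_name, main_path)):
#     print(entry.path)
```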
Good luck and let me know if I can help you further.
Cheers