05-03-2022 06:35 PM
I realise this is not an optimal configuration but I'm trying to pull together a POC and I'm not at the point that I wish to ask the AAD admins to create an application for OAuth authentication.
I have been able to use direct references to the ADLS container, but when I try to mount the container I get a java.lang.NullPointerException: authEndpoint error. I searched but couldn't find any examples of actually mounting using just the storage account key. Is this possible and, if so, how do I do it?
Regards,
Ashley
05-03-2022 07:00 PM
Here's a helper function I use:
def mount_blob_storage(blob_uri: str, secret: str, mnt_point: str) -> str:
    """Mount Azure Blob storage onto Databricks.

    References:
    - Mounting blob storage: https://docs.databricks.com/data/data-sources/azure/azure-storage.html

    Parameters
    ----------
    blob_uri: str
        URI of the blob storage container
    secret: str
        storage account key used for access
    mnt_point: str
        name of the mount point under /mnt, in case you do not want to use the container name

    Returns
    -------
    mnt: str
        path to the mounted storage in the Databricks File System
    """
    # Get container and account.
    # get_storage_account_container is a separate helper that is not shown in this post.
    container, account = get_storage_account_container(blob_uri)
    # Define mount point
    mnt = "/mnt/{}".format(mnt_point)
    # dbutils is provided by the Databricks notebook environment
    dbutils.fs.mount(
        source=blob_uri,
        mount_point=mnt,
        extra_configs={
            "fs.azure.account.key.{}.blob.core.windows.net".format(account): secret
        },
    )
    return mnt
While the reference points to instructions to mount a blob storage account, the same method should work for ADLS.
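The function above calls get_storage_account_container, which isn't included in the post. A minimal sketch of what such a helper might look like follows, assuming the URI has the usual <scheme>://<container>@<account>.<endpoint>/<path> shape; the body is a guess at the behaviour, not the original implementation.

```python
from typing import Tuple
from urllib.parse import urlsplit


def get_storage_account_container(blob_uri: str) -> Tuple[str, str]:
    """Split a storage container URI into (container, account).

    Hypothetical helper: assumes the URI looks like
    <scheme>://<container>@<account>.<endpoint>/<path>.
    """
    parts = urlsplit(blob_uri)
    container = parts.username  # the part before "@"
    host = parts.hostname       # e.g. "<account>.blob.core.windows.net"
    if not container or not host:
        raise ValueError(
            "Expected <scheme>://<container>@<account>.<endpoint>, got {!r}".format(blob_uri)
        )
    account = host.split(".")[0]  # first label of the host is the account name
    return container, account
```

The return order (container, account) matches how the result is unpacked in mount_blob_storage.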
05-03-2022 09:32 PM
Perfect, thanks mradassaad. It didn't work for me initially because I was using the 'abfss:' protocol and .dfs.core.windows.net (instead of .blob.core.windows.net) from another example. Once I changed these it worked perfectly.
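For anyone landing here later, a rough sketch of the argument shapes that go with the blob endpoint. It assumes the 'wasbs:' scheme (the post above only says that 'abfss:' and .dfs.core.windows.net were swapped out), and blob_mount_args is an illustrative name rather than anything from Databricks.

```python
from typing import Any, Dict


def blob_mount_args(container: str, account: str, account_key: str, mnt_point: str) -> Dict[str, Any]:
    """Build keyword arguments for dbutils.fs.mount using the blob endpoint.

    Illustrative helper: assumes the wasbs scheme and the
    <account>.blob.core.windows.net endpoint with an account key.
    """
    endpoint = "{}.blob.core.windows.net".format(account)
    return {
        "source": "wasbs://{}@{}".format(container, endpoint),
        "mount_point": "/mnt/{}".format(mnt_point),
        # The config key must name the same endpoint as the source URI
        "extra_configs": {"fs.azure.account.key.{}".format(endpoint): account_key},
    }
```

In a notebook this could be passed as dbutils.fs.mount(**blob_mount_args(...)). The point is that the endpoint in the source URI and the endpoint in the account key setting have to agree.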
05-04-2022 12:22 AM
The problem is that in that case you will be using it as regular blob storage. To take full advantage of ADLS, two protocols are required. I explained that here (that's why two private links are required):
https://community.databricks.com/s/feed/0D53f00001eQGOHCA4
Maybe just send that manual to the admin so it can be set up correctly. Alternatively, you can use Active Directory passthrough in the premium version.
05-05-2022 06:07 PM
Thanks Hubert, I've bookmarked your article for future reference.
05-04-2022 09:41 AM
Hey there @Ashley Betts
Thank you for posting your question, and it's great that you found the solution.
This is awesome!
Would you be happy to mark the answer as best so that other members can find the solution more quickly?
Cheers!