08-08-2024 02:42 PM
The problem
After setting up Unity Catalog and a managed volume, I can upload files to and download files from the volume through the Databricks workspace UI.
However, I cannot access the volume from a notebook. I created an All-purpose compute and ran dbutils.fs.ls("/Volumes/catalog1/schema1/volumn11"), which returned the error:
Operation failed: "This request is not authorized to perform this operation.", 403, GET
How we set up Unity Catalog and the Managed Volume
- I am the Azure Databricks Account Admin, Metastore Admin, and Workspace Admin
- I created an Azure Databricks Workspace (Premium Tier)
- I created a Databricks Metastore, named metastore1
- I created an Azure ADLS Gen2 storage account (hierarchical namespace enabled), named adsl_gen2_1
- I created an Azure Access Connector for Azure Databricks (an Azure managed identity), named access_connector_for_dbr_1
- On adsl_gen2_1, I assigned the Storage Blob Data Contributor and Storage Queue Data Contributor roles to access_connector_for_dbr_1
- I created two ADLS Gen2 containers under adsl_gen2_1
- One named adsl_gen2_1_container_catalog_default
- Another one named adsl_gen2_1_container_schema1
- I created a Databricks Storage Credential, named dbr_strg_cred_1
- Its connector ID is the resource ID of access_connector_for_dbr_1
- No permissions were granted on the Storage Credential (left empty)
- I created two Databricks External Locations, both using dbr_strg_cred_1
- One named dbr_ext_loc_catalog_default, pointing to the ADLS Gen2 container adsl_gen2_1_container_catalog_default; no permissions were granted on it (empty)
- Another one named dbr_ext_loc_schema1, pointing to the ADLS Gen2 container adsl_gen2_1_container_schema1; no permissions were granted on it (empty)
- I created a Databricks Catalog, named catalog1, under metastore1, and set dbr_ext_loc_catalog_default as this catalog's Storage Location
- I created a Databricks Schema, named schema1, under catalog1, and set dbr_ext_loc_schema1 as this schema's Storage Location
- I created a Databricks Volume, named volumn11, under schema1 (the SQL equivalent of these steps is sketched after this list)
- In the Databricks UI, I can upload files to and download files from volumn11
- However, when I created an All-purpose compute and ran each of the Python calls below (also collected into a single cell after the compute details), I always got the error "Operation failed: "This request is not authorized to perform this operation.", 403, GET".
- dbutils.fs.ls("/Volumes/catalog1/schema1/volumn11")
- dbutils.fs.ls("dbfs:/Volumes/catalog1/schema1/volumn11")
- spark.read.format("csv").option("header","True").load("/Volumes/catalog1/schema1/volumn11/123.csv")
- spark.read.format("csv").option("header","True").load("dbfs:/Volumes/catalog1/schema1/volumn11/123.csv")
Details about the All-purpose compute
- Type: Single node
- Access mode: Single user
- Single user access: myself
- Runtime version: 14.3 LTS
- Enable credential passthrough for user-level data access: disabled
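For completeness, the failing calls above collected into a single notebook cell (123.csv is the test file mentioned above):

```python
# Reproduction: on the compute described above, each of these fails with
# 'Operation failed: "This request is not authorized to perform this operation.", 403, GET'
volume_path = "/Volumes/catalog1/schema1/volumn11"

display(dbutils.fs.ls(volume_path))
display(dbutils.fs.ls(f"dbfs:{volume_path}"))

df = (
    spark.read.format("csv")
    .option("header", "true")
    .load(f"{volume_path}/123.csv")
)
display(df)
```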
Accepted Solutions
08-08-2024 04:27 PM
I found the cause and a solution, but I feel this is a bug, and I wonder what the best practice is.
When I set the ADLS Gen2 storage account's Public network access to "Enabled from all networks", I can access the volume from a notebook.
However, when I set it to "Enabled from selected virtual networks and IP addresses", I cannot access the volume from a notebook, even though I added the VM's public IP to the allowlist, added the resource type Microsoft.Databricks/accessConnectors to the resource instances, and enabled the exception "Allow Azure services on the trusted services list to access this storage account". As I understand it, since my compute has the Unity Catalog badge, it should reach the ADLS Gen2 account through the Access Connector for Azure Databricks (managed identity), so that path should not be blocked by the network restrictions.
11-03-2024 10:59 PM
I had this exact issue, though for me the problem was that I had not configured private endpoints for the "dfs" and "queue" services, only for "blob". Once I added the missing private endpoints, I could list and write to the catalog from a notebook without issues.
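If you want to sanity-check this from a notebook, a small (hypothetical) diagnostic is to resolve the account's blob, dfs, and queue endpoints from the cluster and see whether each one resolves to a private IP (i.e. goes through a private endpoint) or to a public one; the storage account name below is a placeholder:

```python
# Hypothetical diagnostic: resolve the storage endpoints from the cluster to see
# whether each service resolves to a private endpoint (private IP) or a public IP.
import ipaddress
import socket

storage_account = "<storage-account-name>"  # placeholder for the real ADLS Gen2 account

for service in ("blob", "dfs", "queue"):
    host = f"{storage_account}.{service}.core.windows.net"
    try:
        ip = socket.gethostbyname(host)
        kind = "private" if ipaddress.ip_address(ip).is_private else "public"
        print(f"{host} -> {ip} ({kind})")
    except socket.gaierror as exc:
        print(f"{host} -> could not resolve: {exc}")
```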
11-27-2024 10:11 AM
Thank you for this answer! I had exactly the same issue and your post solved my problem.
It really shouldn't throw a 403 error if that is the issue.
12-05-2024 06:54 AM
HTTP 403 is the correct response, as Databricks is forbidden from accessing the resource. You need to add your VNet to the allowlist for this to work.
12-05-2024 06:53 AM
No no no, don't do this! You should have your Databricks running in a VNet (ref: Deploy Azure Databricks in your Azure virtual network (VNet injection) - Azure Databricks | Microsof...).
You then select "Enabled from selected virtual networks and IP addresses" and add your VNet to the allowlist.
When you go to set up Serverless Compute, you will be given a list of VNets to add to this allowlist; add those here as well.
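For anyone scripting this, something along the following lines with the azure-mgmt-storage SDK should switch the account to "Enabled from selected virtual networks and IP addresses" and add a Databricks subnet to the allowlist. This is only a sketch: all IDs are placeholders, the subnet needs the Microsoft.Storage service endpoint enabled, and model/field names may vary between SDK versions.

```python
# Hypothetical sketch (azure-identity + azure-mgmt-storage): restrict public access to
# selected networks and allow a specific VNet subnet. All names and IDs are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import (
    NetworkRuleSet,
    StorageAccountUpdateParameters,
    VirtualNetworkRule,
)

subscription_id = "<subscription-id>"
resource_group = "<resource-group>"
account_name = "<storage-account-name>"
subnet_id = (
    "/subscriptions/<subscription-id>/resourceGroups/<vnet-rg>"
    "/providers/Microsoft.Network/virtualNetworks/<databricks-vnet>"
    "/subnets/<databricks-subnet>"  # subnet must have the Microsoft.Storage service endpoint
)

client = StorageManagementClient(DefaultAzureCredential(), subscription_id)

client.storage_accounts.update(
    resource_group,
    account_name,
    StorageAccountUpdateParameters(
        network_rule_set=NetworkRuleSet(
            default_action="Deny",   # i.e. "Enabled from selected virtual networks and IP addresses"
            bypass="AzureServices",  # keep the trusted-services exception enabled
            virtual_network_rules=[
                VirtualNetworkRule(virtual_network_resource_id=subnet_id)
            ],
        )
    ),
)
```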

