
Azure Databricks Unity Catalog - Cannot access Managed Volume in notebook

AlbertWang
Contributor III

The problem

After setting up Unity Catalog and a managed Volume, I can upload and download files to and from the volume in the Databricks workspace UI.

However, I cannot access the volume from a notebook. I created an all-purpose compute, ran dbutils.fs.ls("/Volumes/catalog1/schema1/volumn11"), and got the error

Operation failed: "This request is not authorized to perform this operation.", 403, GET

How we set up Unity Catalog and the managed Volume

  1. I am the Azure Databricks Account Admin, Metastore Admin, and Workspace Admin
  2. I created an Azure Databricks Workspace (Premium Tier)
  3. I created a Databricks Metastore, named metastore1
  4. I created an Azure ADLS Gen2 storage account (hierarchical namespace enabled), named adsl_gen2_1
  5. I created an Azure Access Connector for Azure Databricks (as an Azure Managed Identity), named access_connector_for_dbr_1
  6. On adsl_gen2_1, I assigned the Storage Blob Data Contributor and Storage Queue Data Contributor roles to access_connector_for_dbr_1
  7. I created two ADLS Gen2 containers under adsl_gen2_1
    • One named adsl_gen2_1_container_catalog_default
    • Another one named adsl_gen2_1_container_schema1
  8. I created a Databricks Storage Credential, named dbr_strg_cred_1
    • Its connector ID is the resource ID of access_connector_for_dbr_1
    • The Permissions of the Storage Credential were left empty
  9. I created two Databricks External Locations, both using dbr_strg_cred_1
    • One named dbr_ext_loc_catalog_default, pointing to the ADLS Gen2 container adsl_gen2_1_container_catalog_default; its Permissions were left empty
    • Another named dbr_ext_loc_schema1, pointing to the ADLS Gen2 container adsl_gen2_1_container_schema1; its Permissions were left empty
  10. I created a Databricks Catalog, named catalog1, under metastore1, and set dbr_ext_loc_catalog_default as this catalog's Storage Location
  11. I created a Databricks Schema, named schema1, under catalog1, and set dbr_ext_loc_schema1 as this schema's Storage Location
  12. I created a Databricks Volume, named volumn11, under schema1 (see the notebook sketch after this list for the equivalent SQL of steps 9-12)
  13. In the Databricks UI, I can upload files to and download files from volumn11
  14. However, when I created an all-purpose compute and ran the Python code below, I always got the error "Operation failed: "This request is not authorized to perform this operation.", 403, GET".
    • dbutils.fs.ls("/Volumes/catalog1/schema1/volumn11")
    • dbutils.fs.ls("dbfs:/Volumes/catalog1/schema1/volumn11")
    • spark.read.format("csv").option("header","True").load("/Volumes/catalog1/schema1/volumn11/123.csv")
    • spark.read.format("csv").option("header","True").load("dbfs:/Volumes/catalog1/schema1/volumn11/123.csv")

Details about the All-purpose compute

  • Type: Single node
  • Access mode: Single user
  • Single user access: myself
  • Runtime version: 14.3 LTS
  • Enable credential passthrough for user-level data access: disabled
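One way to rule out a missing Unity Catalog grant (as opposed to a storage networking issue) is to inspect the grants from a notebook. A minimal diagnostic sketch, assuming the object names above:

```python
# Diagnostic sketch: confirm the current user has privileges on the objects involved.
# As the owner/metastore admin I should have them implicitly; this just makes it visible.
display(spark.sql("SHOW GRANTS ON VOLUME catalog1.schema1.volumn11"))
display(spark.sql("SHOW GRANTS ON EXTERNAL LOCATION dbr_ext_loc_schema1"))
display(spark.sql("DESCRIBE VOLUME catalog1.schema1.volumn11"))
```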
1 ACCEPTED SOLUTION


AlbertWang
Contributor III

I found the reason and a solution, but I feel this is a bug, and I wonder what the best practice is.

When I set the ADLS Gen2 storage account's Public network access to "Enabled from all networks", as shown below, I can access the volume from a notebook.

[Screenshot: storage account networking set to "Enabled from all networks"]

However, if I set the ADLS Gen2 storage account's Public network access to "Enabled from selected virtual networks and IP addresses", as shown below, I cannot access the volume from a notebook, even though I added the VM's public IP to the firewall allowlist, added Microsoft.Databricks/accessConnectors as a resource instance, and enabled the exception "Allow Azure services on the trusted services list to access this storage account". As I understand it, because my compute has the Unity Catalog badge, it should reach the ADLS Gen2 account through the Access Connector for Azure Databricks (managed identity), so these settings should have been sufficient.

[Screenshots: storage account networking restricted to selected virtual networks and IP addresses, with the access connector added as a resource instance and the trusted-services exception enabled]


2 REPLIES


I had this exact issue, though for me the problem was that I had not configured private endpoints for the "dfs" and "queue" services, only for "blob". Once I added the missing private endpoints, I could list and write to the catalog from a notebook without issues.
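If private endpoints are involved, a quick way to check whether the "blob", "dfs", and "queue" endpoints all resolve to private addresses from the cluster is a DNS lookup in a notebook. A rough sketch (the storage account name is a placeholder):

```python
import socket

account = "adslgen21"  # placeholder; replace with the real storage account name

# With working private endpoints and private DNS zones, each hostname should
# resolve to a private address (e.g. 10.x.x.x) from the cluster.
for service in ("blob", "dfs", "queue"):
    host = f"{account}.{service}.core.windows.net"
    try:
        print(f"{host} -> {socket.gethostbyname(host)}")
    except socket.gaierror as err:
        print(f"{host} -> DNS lookup failed: {err}")
```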
