06-07-2023 04:57 PM
My workspace has a couple of different types of clusters, and I'm having issues using the `dbutils` filesystem utilities when connected to a shared cluster. I'm hoping you can help me fix the shared cluster's configuration so that I can actually use the `dbutils` filesystem commands.
The workspace is set up to use Unity Catalog, and I'm not sure if that has anything to do with the error.
When I try to `ls` the DBFS root location, I get an `INSUFFICIENT_PERMISSIONS` Spark security exception.
The cluster this happens on is a Shared cluster with the data security mode set to `USER_ISOLATION` (by Terraform). The cluster details page says "Unrestricted," but we set the data security mode in Terraform.
This error does not occur on a Single User cluster with the Individual use policy.
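For concreteness, a minimal repro from a notebook attached to the Shared cluster (the exception text is paraphrased):

```python
# Lists the DBFS root fine on the Single User cluster, but on the Shared
# (USER_ISOLATION) cluster it raises a Spark security exception along the
# lines of: [INSUFFICIENT_PERMISSIONS] Insufficient privileges ...
dbutils.fs.ls("/")
```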
Can you give me guidance on how to configure the shared cluster so that `dbutils.fs.ls("/")` won't error with insufficient permissions?
Thank you so much!
06-16-2023 12:08 AM
Hi @Spencer Kent
We haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you.
If they did, or if you found another solution, please share it with the community, as it can be helpful to others.
Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.
06-16-2023 04:48 PM
Unfortunately the suggestion was not helpful. It is no mystery what the error is (insufficient permissions to access the DBFS root location). What remains a mystery, and the point of my question, is whether a particular configuration of shared clusters is required to make the DBFS root location accessible. I'm asking here because I have not found this question discussed in the Databricks documentation, and because the Terraform provider I'm using does not list any cluster configuration options that seem relevant to making the DBFS root (or any of the DBFS filesystem utilities) available on a shared cluster.
12-22-2023 02:57 AM
Hi @Spencer_Kent ,
Please go through this: https://learn.microsoft.com/en-us/azure/databricks/dbfs/unity-catalog
Shared access mode combines Unity Catalog data governance with Azure Databricks legacy table ACLs. Access to data in the `hive_metastore` is only available to users that have permissions explicitly granted.
To interact with files directly using DBFS, you must have `ANY FILE` permissions granted. Because `ANY FILE` allows users to bypass legacy table ACLs in the `hive_metastore` and access all data managed by DBFS, Databricks recommends caution when granting this privilege.
Clusters configured with Single User access mode have full access to DBFS, including all files in the DBFS root and mounted data. DBFS root and mounts are available in this access mode, making it a good choice for ML workloads that also need access to Unity Catalog datasets.
Databricks recommends using service principals with scheduled jobs and Single User access mode for production workloads that need access to data managed by both DBFS and Unity Catalog.
01-17-2024 02:13 AM
Hello,
I'm encountering a similar issue. We have a team of researchers using a shared cluster without access to the Hive metastore. I've looked through the documentation, but there doesn't seem to be a way to define or grant `ANY FILE` during the cluster initialization process.
Moreover, what if I'm looking to access an S3 bucket path directly? What is the approach for defining that?
Please advise.
01-23-2024 05:15 PM
I could not find the `ANY FILE` permission either.
01-31-2024 02:15 AM
There are two ways to grant access to DBFS using `ANY FILE`:
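The grant itself is a SQL statement; a minimal sketch, assuming a workspace admin runs it on a cluster where legacy table access control applies (the principals below are placeholders):

```python
# Grant file access to a single user (placeholder e-mail address).
spark.sql("GRANT SELECT ON ANY FILE TO `someone@example.com`")

# Or grant it to a whole group (placeholder group name).
spark.sql("GRANT SELECT ON ANY FILE TO `data_eng`")

# Verify that the grants went through.
spark.sql("SHOW GRANTS ON ANY FILE").show()
```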
07-01-2024 01:40 PM
I have the same problem: I receive "No such file or directory" when trying to access a file in DBFS using a cluster in shared access mode. I'm using shared access mode because I want to use table access controls.
I ran `GRANT ALL PRIVILEGES ON ANY FILE TO <user>` and `SHOW GRANTS ON ANY FILE` to verify the grant went through, but I am still unable to access the file in DBFS.
@Retired_mod could you advise? Thanks!
07-01-2024 11:02 PM
You can't access files on DBFS mounts using a Shared cluster. Either use a Unity Catalog Volume or use a Single user cluster.
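If Volumes are available in your workspace, file access through them looks roughly like this (the catalog, schema, and volume names are placeholders):

```python
# Unity Catalog Volumes are exposed under /Volumes/<catalog>/<schema>/<volume>
# and work with dbutils on Shared (USER_ISOLATION) clusters.
volume_path = "/Volumes/main/default/my_volume"  # placeholder names

dbutils.fs.ls(volume_path)  # list files in the volume
dbutils.fs.put(f"{volume_path}/hello.txt", "hello", overwrite=True)  # write a file
```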
07-02-2024 08:11 AM
@jacovangelder We are using Azure Gov, so we don't have access to Unity Catalog. What would you suggest we do to control certain users' access to data, since you can't use table access controls with Single User clusters?
07-02-2024 10:44 PM
Can you not use a No Isolation Shared cluster with table access controls enabled at the workspace level?
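The grants themselves would then be the usual legacy table ACL statements; a sketch with placeholder table and group names:

```python
# Run by an admin on a cluster with table access control enabled.
# `research` and both table names are placeholders.
spark.sql("GRANT SELECT ON TABLE default.sensor_readings TO `research`")
spark.sql("DENY SELECT ON TABLE default.salaries TO `research`")
spark.sql("SHOW GRANTS ON TABLE default.sensor_readings").show()
```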