11-06-2022 03:50 AM
Hi team,
Users are unable to run SELECT on data located in S3 buckets, even though the S3 permissions are OK.
The only way they manage to do it is by being granted the Databricks workspace admin permission.
Attached the error.
Thanks!
11-07-2022 04:41 AM
Hi @Avi Edri ,
It looks like your users are missing the SELECT privilege on ANY FILE (which admins are granted by default). Please see here for more details:
https://docs.databricks.com/security/access-control/table-acls/object-privileges.html
I assume you are not using Unity Catalog. Without Unity Catalog it's not easy to support both kinds of access: through tables and through file paths (spark.read... ).
You might need to revisit data access on your side. I believe that on a cluster with Table Access Control enabled you are limited to using tables (select * from some_table) unless you have the SELECT privilege on ANY FILE, which lets you bypass this restriction.
Unity Catalog is the way forward; it enables more security and allows some flexibility here.
thanks,
Patryk.
11-08-2022 08:25 AM
Thank you Pat,
Can you please guide me on how to grant the ANY FILE permission to my users or groups?
Also, is there a way to grant SELECT on all databases via a mysql command or the terminal?
We are not using Unity Catalog, and our table permission policy is enabled.
Avi
11-08-2022 08:32 AM
This will work with SQL (in a notebook):
GRANT SELECT ON ANY FILE TO `group-name`
or maybe this with Terraform:
resource "databricks_sql_permissions" "any_file" {
  any_file = true
  privilege_assignments {
    principal  = "group-name"
    privileges = ["SELECT"]
  }
}
I haven't tried the Terraform one.
thanks,
Pat.
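On the second question (granting SELECT across all databases), one option is to generate one GRANT statement per database and run them from a notebook. The snippet below is only a sketch: the helper name `build_grants` and the database names are hypothetical, and it assumes that USAGE plus SELECT on each database is what your users need under legacy table access control.

```python
def build_grants(databases, principal, privileges=("USAGE", "SELECT")):
    """Return one GRANT statement per database for the given principal."""
    privilege_list = ", ".join(privileges)
    return [
        f"GRANT {privilege_list} ON DATABASE `{db}` TO `{principal}`"
        for db in databases
    ]


# Hypothetical database names; in a notebook the list could come from
# spark.sql("SHOW DATABASES") instead.
statements = build_grants(["sales", "marketing"], "group-name")
for stmt in statements:
    print(stmt)
    # On a cluster with Table Access Control enabled: spark.sql(stmt)
```

The statements are built as plain strings so they can be reviewed before anything is executed.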
11-08-2022 09:53 AM
Thank you Pat,
When running this query I'm getting the exception below:
Error in SQL statement: SparkException: Trying to perform permission action on Hive Metastore /ANY_FILE but Table Access Control is not enabled on this cluster.
I verified that table access control is enabled in my workspace settings.
11-08-2022 11:35 AM
You need to use a cluster with Table Access Control (TAC) enabled; the workspace setting alone is not enough.
https://docs.databricks.com/security/access-control/table-acls/table-acl.html
There were some changes to the UI recently; you can follow the instructions here.
https://docs.databricks.com/clusters/cluster-ui-preview.html
thanks,
Pat.
11-08-2022 11:39 PM
11-09-2022 01:49 AM
Hmm,
let's try something like this.
When you create a new cluster, click `UI Preview` and make sure `Legacy UI is enabled` is selected.
Choose Cluster mode: High Concurrency.
In Advanced Options:
Table Access Control - Enable.
On the right side you can switch to JSON and see what I have:
{
  "autoscale": {
    "min_workers": 2,
    "max_workers": 8
  },
  "cluster_name": "Pat's Cluster",
  "spark_version": "10.4.x-scala2.12",
  "spark_conf": {
    "spark.databricks.cluster.profile": "serverless",
    "spark.databricks.repl.allowedLanguages": "python,sql",
    "spark.databricks.acl.dfAclsEnabled": "true"
  },
  "aws_attributes": {
    "first_on_demand": 1,
    "availability": "SPOT_WITH_FALLBACK",
    "zone_id": "auto",
    "spot_bid_price_percent": 100,
    "ebs_volume_type": null,
    "ebs_volume_count": null,
    "ebs_volume_size": null
  },
  "node_type_id": "i3.2xlarge",
  "ssh_public_keys": [],
  "custom_tags": {
    "ResourceClass": "Serverless"
  },
  "spark_env_vars": {
    "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
  },
  "autotermination_minutes": 0,
  "enable_elastic_disk": true,
  "cluster_source": "UI",
  "init_scripts": [],
  "data_security_mode": null,
  "runtime_engine": "STANDARD"
}
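If you want to check your own cluster against this, a small script can look for the same setting in the JSON you copy from the cluster page. This is a minimal sketch: the function name `tac_enabled` is made up, and it assumes that `spark.databricks.acl.dfAclsEnabled` set to "true" in `spark_conf` (as in the JSON above) is the setting that marks Table Access Control as enabled.

```python
import json


def tac_enabled(cluster_json: str) -> bool:
    """Return True if the cluster JSON carries the TAC flag in spark_conf."""
    # spark_conf may be missing or null in an exported definition.
    conf = json.loads(cluster_json).get("spark_conf") or {}
    flag = conf.get("spark.databricks.acl.dfAclsEnabled", "")
    return str(flag).lower() == "true"


# Trimmed-down example containing only the key that is inspected.
sample = '{"spark_conf": {"spark.databricks.acl.dfAclsEnabled": "true"}}'
print(tac_enabled(sample))
print(tac_enabled('{"spark_conf": {}}'))
```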
11-09-2022 03:21 AM
11-09-2022 03:41 AM
I think you always need to specify the securable object:
https://docs.databricks.com/sql/language-manual/security-show-grant.html
SHOW GRANTS [ principal ] ON securable_object
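To make the syntax concrete, here is a small sketch that fills in that template. The helper `show_grants` is hypothetical, and `group-name` is the same placeholder used earlier in the thread; the resulting string would be run in a SQL cell or via spark.sql on a TAC-enabled cluster.

```python
def show_grants(securable_object, principal=None):
    """Build a SHOW GRANTS statement; the principal part is optional."""
    target = f"`{principal}` " if principal else ""
    return f"SHOW GRANTS {target}ON {securable_object}"


# Check the ANY FILE grant discussed earlier in the thread.
print(show_grants("ANY FILE", "group-name"))
# Without a principal, the statement covers the securable object as a whole.
print(show_grants("DATABASE default"))
```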
11-09-2022 04:07 AM
Great,
Thank you Pat!
11-08-2022 01:34 PM
@Avi Edri, adding some more info to @Pat Sienkiewicz's suggestion: are you using a cluster with an instance profile? If an instance profile is configured, please validate that it has read permissions on that bucket and that the cluster with the instance profile assigned is enabled for the user.
11-09-2022 01:40 AM
Hi @karthik p
Yes, all relevant S3 bucket permissions for this user are set.
Thanks!
11-09-2022 01:41 AM
Thanks ! @Kaniz Fatma
Will update as soon as my issue is resolved.
Avi