Data Access Control in Databricks
08-23-2022 12:56 AM
Best Practices for Securing Access to Data in Databricks
Unity Catalog is the unified governance solution for Data & AI assets in Databricks, and it greatly simplifies and centralizes data access control. This guide covers best practices for both the streamlined approach with Unity Catalog and the approach without it.
Data Access Control with Unity Catalog
Unity Catalog moves access control for files, databases, tables, rows, columns, and more from the cluster level to the metastore level, and it lets you manage users, groups, and permissions across workspaces.
Continued below
08-23-2022 12:57 AM
- To enable a workspace for Unity Catalog:
  - Create an S3 bucket and IAM role (AWS | GCP) or Access Connector (Azure) that Unity Catalog will use as the default storage for managed tables (AWS | Azure | GCP).
  - Create a metastore using that IAM role (AWS | GCP) or Access Connector (Azure), and attach the metastore to each workspace that should have access to it.
- For securing access to buckets, folders, and blobs in S3/ADLS/GCS:
  - For access to data in the default S3/ADLS/GCS bucket/container:
    - A Managed Storage Credential (AWS | Azure | GCP) was automatically created when the metastore was set up.
    - Create an External Location (AWS | Azure | GCP) using that Managed Storage Credential to scope access down to the specific storage path within that bucket/container that you want to grant access to.
    - Grant access to that External Location to the groups that should be able to read, write, or create tables on top of those S3/ADLS/GCS locations (AWS | Azure | GCP).
Continued Below
08-23-2022 12:58 AM
- To enable a workspace for Unity Catalog: (see above)
- For securing access to buckets, folders, and blobs in S3/ADLS/GCS: (see above)
  - For access to data in the default S3/ADLS/GCS bucket/container: (see above)
  - For access to data in external S3/ADLS/GCS buckets/containers:
    - Create an IAM role (AWS | GCP) or Managed Identity (Azure) to provide access to this S3/ADLS/GCS bucket/container.
    - Create a Storage Credential with that IAM role (AWS | GCP) or Managed Identity (Azure).
    - Create an External Location (AWS | Azure | GCP) using that Storage Credential to scope access down to the specific storage path within that bucket/container that you want to grant access to.
    - Grant access to that External Location to the groups that should be able to read, write, or create tables on top of those S3/ADLS/GCS locations (AWS | Azure | GCP).
- For databases and tables:
  - Enable clusters and SQL warehouses to leverage Unity Catalog
  - Fine-grained access control
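The External Location and grant steps above map onto a handful of Unity Catalog SQL statements. Here is a minimal Python sketch that only builds those statements as strings. The location, credential, bucket path, catalog, table, and group names are all hypothetical placeholders, and privilege names have changed between Unity Catalog releases (for example `USE CATALOG` replaced the older `USAGE`), so check them against the documentation for your workspace before running anything.

```python
def quote_ident(name: str) -> str:
    """Wrap a single identifier or principal in backticks."""
    return "`" + name.replace("`", "``") + "`"


def quote_name(name: str) -> str:
    """Quote each part of a dotted name such as catalog.schema.table."""
    return ".".join(quote_ident(part) for part in name.split("."))


def create_external_location(location: str, url: str, credential: str) -> str:
    """Build a statement that scopes a storage credential to one storage path."""
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {quote_ident(location)} "
        f"URL '{url}' "
        f"WITH (STORAGE CREDENTIAL {quote_ident(credential)})"
    )


def grant(privileges: list, securable_type: str, securable: str, principal: str) -> str:
    """Build a GRANT statement for a group or user."""
    privs = ", ".join(privileges)
    return (
        f"GRANT {privs} ON {securable_type} {quote_name(securable)} "
        f"TO {quote_ident(principal)}"
    )


# Hypothetical names throughout; replace with your own objects and groups.
statements = [
    create_external_location(
        "sales_raw", "s3://example-bucket/sales/raw", "sales_credential"
    ),
    grant(["READ FILES", "WRITE FILES"], "EXTERNAL LOCATION", "sales_raw", "data-engineers"),
    grant(["USE CATALOG"], "CATALOG", "main", "analysts"),
    grant(["USE SCHEMA"], "SCHEMA", "main.sales", "analysts"),
    grant(["SELECT"], "TABLE", "main.sales.orders", "analysts"),
]

for stmt in statements:
    print(stmt)
```

On a Unity Catalog enabled cluster or SQL warehouse, each string could be passed to `spark.sql(stmt)` by a user who has the rights to create the location and issue the grants.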
Continued Below
08-23-2022 01:23 AM
Data Access Control without Unity Catalog
Prior to Unity Catalog, data access was controlled at the cluster level using Table Access Controls.
- For securing access to buckets, folders, and blobs in S3/ADLS/GCS:
  - Create an IAM role and instance profile (AWS) that has access to the S3 buckets/folders you want to grant to a team, create a Service Principal for access to ADLS Gen2 containers/blobs (Azure), or use a Service Account to connect to a GCS bucket (GCP).
  - Attach the instance profile to the DS&E cluster (AWS), mount the ADLS Gen2 container to the workspace using the Service Principal (Azure), or add the GCP Service Account email to the DS&E cluster (GCP).
  - Use cluster entitlements (AWS | Azure | GCP) to turn off unrestricted cluster creation for DS&E groups.
  - Provide access to that cluster or cluster policy using Cluster ACLs (AWS | Azure | GCP).
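As a rough illustration of the attach/mount step above, the sketch below builds the pieces of configuration involved on each cloud. It assumes the Clusters API fields `aws_attributes.instance_profile_arn` and `gcp_attributes.google_service_account`, plus the standard ABFS OAuth settings for a Service Principal. The cluster name, ARN, service account email, and IDs are hypothetical, and the cluster payload is deliberately partial (a real request also needs fields such as the Spark version and node type).

```python
import json


def cluster_identity_spec(cloud: str, identity: str) -> dict:
    """Return a partial Clusters API payload that attaches a cloud identity."""
    spec = {"cluster_name": "dse-team-cluster", "num_workers": 2}  # hypothetical values
    if cloud == "aws":
        # Instance profile ARN that grants access to the team's S3 paths.
        spec["aws_attributes"] = {"instance_profile_arn": identity}
    elif cloud == "gcp":
        # Email of the GCP Service Account that can read the GCS bucket.
        spec["gcp_attributes"] = {"google_service_account": identity}
    else:
        raise ValueError("use adls_oauth_configs() for Azure mounts")
    return spec


def adls_oauth_configs(client_id: str, client_secret: str, tenant_id: str) -> dict:
    """Return Spark configs for mounting ADLS Gen2 with a Service Principal."""
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


aws_spec = cluster_identity_spec(
    "aws", "arn:aws:iam::123456789012:instance-profile/example-team-profile"
)
print(json.dumps(aws_spec, indent=2))
```

In a notebook, the Azure dictionary would typically be passed as `extra_configs` to `dbutils.fs.mount(...)`, with the client secret read from a secret scope via `dbutils.secrets.get(...)` instead of being hard-coded.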
Continued Below
04-12-2023 12:09 AM
- For securing access to buckets, folders, and blobs in S3/ADLS/GCS: (see above)
- For databases and tables:
  - Fine-grained access control
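Without Unity Catalog, fine-grained access on databases and tables is usually expressed with legacy Table Access Control grants, and one common pattern for column-level control is a view that uses the `is_member()` SQL function. The sketch below only builds such statements as strings. The database, table, view, column, and group names are hypothetical, and the masking view is one possible approach, not something the steps above prescribe.

```python
def legacy_table_grants(database: str, table: str, group: str) -> list:
    """Build legacy (Hive metastore) grants that let a group read one table."""
    principal = f"`{group}`"
    return [
        # USAGE on the database is needed before table privileges take effect.
        f"GRANT USAGE ON DATABASE {database} TO {principal}",
        f"GRANT SELECT ON TABLE {database}.{table} TO {principal}",
    ]


def masked_view(database: str, view: str, table: str,
                open_columns: list, masked_column: str, group: str) -> str:
    """Build a view that shows masked_column only to members of group."""
    cols = ", ".join(open_columns)
    return (
        f"CREATE OR REPLACE VIEW {database}.{view} AS "
        f"SELECT {cols}, "
        f"CASE WHEN is_member('{group}') THEN {masked_column} "
        f"ELSE 'REDACTED' END AS {masked_column} "
        f"FROM {database}.{table}"
    )


# Hypothetical objects and groups, for illustration only.
for stmt in legacy_table_grants("sales", "orders", "analysts"):
    print(stmt)
print(masked_view("sales", "orders_masked", "orders",
                  ["order_id", "amount"], "email", "auditors"))
```

These statements only take effect on clusters that have Table Access Control enabled, which is why the cluster-level steps earlier in this section matter.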
04-12-2023 12:10 AM
Let us know whether this walkthrough helped you set up data access control, and how your journey to leveraging Unity Catalog is going!

