
Unity Catalog and Data Accessibility

SenthilJ
New Contributor III

Hi,

I have a few questions about the internals of #Unity Catalog in #Databricks

1. I understand that we can customize the UC metastore at different levels (catalog/schema). Where is the information about the UC permission model stored for each data object (tables/views) in Databricks?

2. Assume the following scenario while using #Azure

  • Databricks Workspaces A and B are under the same region in the US and the same Databricks account registered to a Unity Catalog metastore called "uc-metastore-1". These two workspaces are separated out using their own VNets in Azure.
  • Workspace A connects to Azure ADLS ADL1 and workspace B connects to Azure ADLS ADL2 using their respective access connectors.
  • User X is part of the workspace A and user Y is part of the workspace B. 
  • User X created a data object "X-DB-Table1" and User Y created a data object "Y-DB-Table1" in their respective workspaces. Both are external Delta tables in custom storage locations.
  • The metastore admin grants User Y access to User X's data object "X-DB-Table1". After the grant, User Y is able to query the table "X-DB-Table1" directly from Workspace B.

What happens under the hood when such querying happens?

  1. How does Workspace B query the table "X-DB-Table1", which is linked to Workspace A, using its own access connector, given that the data for "X-DB-Table1" sits in Workspace A's network?
  2. Does Unity Catalog automatically elevate the privileges of Workspace B to allow access to Workspace A's access connector?

Accepted Solutions

Kaniz
Community Manager

Hi @SenthilJ

 

Unity Catalog Permission Model: The Unity Catalog permission model is based on standard ANSI SQL and.... Privileges can be granted by either a metastore admin, the owner of an object, or the owner of the c.... If your workspace was enabled for Unity Catalog automatically, the workspace is attached to a metast.... You can manage privileges for metastore objects by using SQL commands, Unity Catalog CLI (legacy), o....

 

Data Access Across Workspaces: In Databricks, a metastore is the top-level container for data in Unity Catalog. Each metastore exposes a 3-level namespace (catalog.schema.table) by which data can be organized. You can share a single metastore across multiple Databricks workspaces in an account. Each linked workspace has the same view of the data in the metastore, and you can manage data access.... You can create one metastore per region and attach it to any number of workspaces in that region.

 

When User Y queries the table “X-DB-Table1” from Workspace B, the query runs against the data object registered in the shared Unity Catalog metastore. The actual data for “X-DB-Table1” is stored in Azure ADLS ADL1, the storage that Workspace A connects to. Unity Catalog manages the metadata and permissions, which allows User Y to access “X-DB-Table1” even though the data resides in Workspace A’s network. Note that data access is governed by Unity Catalog; Workspace B’s privileges are not elevated to use Workspace A’s access connector.
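For the grant step in the scenario, here is a minimal sketch of the SQL a metastore admin might run. It only builds the statements as strings so it can be read and run anywhere; on Databricks you would pass each one to `spark.sql(...)`. The catalog name `main`, the schema name `sales`, and the principal `user.y@example.com` are assumptions for illustration and do not come from the original question. It also assumes User Y needs `USE CATALOG` and `USE SCHEMA` on the parents in addition to `SELECT` on the table.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_read_grants(catalog: str, schema: str, table: str, principal: str) -> list[str]:
    """Return the GRANT statements that give a principal read access to one table."""
    cat = quote_ident(catalog)
    sch = f"{cat}.{quote_ident(schema)}"
    tbl = f"{sch}.{quote_ident(table)}"
    who = quote_ident(principal)
    return [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {sch} TO {who}",
        f"GRANT SELECT ON TABLE {tbl} TO {who}",
    ]


# Hypothetical names: only the table name "X-DB-Table1" is taken from the question.
statements = build_read_grants("main", "sales", "X-DB-Table1", "user.y@example.com")
for stmt in statements:
    print(stmt)
```

The table name contains hyphens, so it has to be backtick-quoted. The grants are recorded in the metastore rather than in either workspace, which is why they apply from Workspace B as well.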

 

Let me know if that helps.


SenthilJ
New Contributor III

Thank you @Kaniz, your response really helps. A quick follow-up: when Unity Catalog uses its permissions to access objects across workspaces, what kind of connection method does it use to access the data object, i.e. in this case, when User Y queries the table “X-DB-Table1” from Workspace B? Also, where is Unity Catalog's permission metadata (in the metastore) physically stored: in the Control Plane?

Kaniz
Community Manager

Hi @SenthilJ

Unity Catalog manages access to data and other objects across workspaces. Access can be granted by either a metastore admin, an object's owner, or the catalog or schema that .... When User Y queries the table “X-DB-Table1” from Workspace B, Unity Catalog checks the permissions granted to User Y on that specific table. If User Y has the necessary permissions, the query is executed.
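To make the permission check concrete, here is a toy model in Python. It is not how Unity Catalog is implemented. The class `ToyMetastore`, its methods, and the names `main` and `sales` are all hypothetical, and the model ignores ownership, privilege inheritance, and groups. It only illustrates one point: the check is keyed on the principal and the securable object, not on the workspace the query comes from.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMetastore:
    """Illustrative stand-in for a shared metastore; not a Databricks API."""

    # Each grant is a (principal, privilege, securable) triple.
    grants: set = field(default_factory=set)

    def grant(self, principal: str, privilege: str, securable: str) -> None:
        self.grants.add((principal, privilege, securable))

    def can_select(self, principal: str, catalog: str, schema: str, table: str) -> bool:
        """True only if the principal holds all three privileges needed to read the table."""
        needed = [
            ("USE CATALOG", catalog),
            ("USE SCHEMA", f"{catalog}.{schema}"),
            ("SELECT", f"{catalog}.{schema}.{table}"),
        ]
        return all((principal, priv, obj) in self.grants for priv, obj in needed)


# One metastore shared by Workspace A and Workspace B (hypothetical names).
metastore = ToyMetastore()
metastore.grant("user_y", "USE CATALOG", "main")
metastore.grant("user_y", "USE SCHEMA", "main.sales")
metastore.grant("user_y", "SELECT", "main.sales.X-DB-Table1")

print(metastore.can_select("user_y", "main", "sales", "X-DB-Table1"))
print(metastore.can_select("user_z", "main", "sales", "X-DB-Table1"))
```

Nothing in `can_select` takes a workspace argument, which mirrors the point above: the grant lives in the shared metastore, so the result is the same from any workspace attached to it.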

 

As for the storage of Unity Catalog’s permission metadata, it is stored in the metastore. A metastore is the top-level container of objects in Unity Catalog. It registers metadata about data and AI assets and the permissions governing access. This includes information about tables, volumes, external locations, and shares. So, the Unity Catalog’s permission metadata is not physically stored in the Control Plane but in the metastore.

 

Please note that initially, users have no access to data in a metastore. Access can be granted by either a metastore admin, an object's owner, or the catalog or schema that ... The metastore admin is a highly privileged user or group in Unity Catalog. They can manage the privileges or transfer ownership of any object within the metastore, including s... They can also grant themselves read and write access to any data in the metastore.

I hope this answers your questions. If you have any more questions, feel free to ask! 😊
