
Databricks Clean Rooms with 3 or more collaborators

pardeep7
New Contributor II

Let's say I create a clean room with two other collaborators, call them Collaborator A and Collaborator B (so three in total, including me), and then share some tables to the clean room. If Collaborator A writes code that does a "SELECT * FROM creator.<table>" (i.e. against one of my tables), would Collaborator B essentially be able to see all of my potentially sensitive data? Or are there ways to prevent anyone from writing queries like this?

Thanks in advance.

3 REPLIES

KaranamS
Contributor III

Hi @pardeep7,

Databricks Clean Rooms uses Delta Sharing to share data between collaborators. This allows collaborators to share tables, volumes, and notebooks with the central clean room without exposing the underlying raw data directly to each other. See the Databricks documentation for reference: https://docs.databricks.com/aws/en/clean-rooms#how-does-clean-rooms-work

So, ideally, Collaborator B would not be able to see any sensitive data.

Hope this helps!

pardeep7
New Contributor II

What I'm unsure about is that with Delta Sharing you share the whole table, right? So is there anything stopping Collaborator A from writing a query on all the sensitive columns of the Creator's table, which Collaborator B could then run, exposing the Creator's sensitive data to Collaborator B?

KaranamS
Contributor III

Hi @pardeep7, as per my understanding, all participants in a clean room can see only metadata. The raw data in your tables is not directly accessed by other collaborators.

Any output tables created by collaborators from their queries or notebooks are read-only, and by default only the specific principal (user, group, or service principal) who runs the notebook has read access to the output table. Also, notebooks created by one collaborator cannot be run by other collaborators. So, in your case, if Collaborator A creates a notebook and an output table, only Collaborator A will have access to that notebook and table, unless Collaborator B has been explicitly granted permission to access the output table.

References: https://docs.databricks.com/aws/en/clean-rooms#how-is-my-data-managed-in-a-clean-room

https://docs.databricks.com/aws/en/clean-rooms/clean-room-notebook#before-you-begin

 

 
