Databricks Clean Rooms with 3 or more collaborators
02-24-2025 08:07 AM
Let's say I create a clean room with 2 other collaborators, call them collaborator A and collaborator B (so 3 in total, including me), and then share some tables to the clean room. If collaborator A writes code that runs "SELECT * FROM creator.<table>" (i.e. against one of my tables), would collaborator B essentially be able to see all of my potentially sensitive data? Or are there ways to prevent anyone from writing queries like this?
Thanks in advance.
02-24-2025 11:19 AM
Hi @pardeep7 ,
Databricks Clean Rooms uses Delta Sharing to share data between collaborators. This allows collaborators to share tables, volumes, and notebooks with the central clean room without exposing the underlying raw data directly to each other. See the Databricks documentation for reference: https://docs.databricks.com/aws/en/clean-rooms#how-does-clean-rooms-work
So, ideally, Collaborator B would not be able to see any sensitive data.
Hope this helps!
02-24-2025 10:18 PM
What I'm unsure about is that with Delta Sharing you share the whole table, right? So is there anything stopping Collaborator A from writing a query over all the sensitive columns of the Creator's table, which Collaborator B could then run, exposing the Creator's sensitive data to Collaborator B?
02-25-2025 10:23 AM
Hi @pardeep7 , as per my understanding, all participants in a clean room can see only metadata. The raw data in your tables is not directly accessed by other collaborators.
Any output tables created by collaborators from their queries/notebooks are read-only, and by default only the specific principal (user, group, or service principal) who runs the notebook has read access to the output table. Also, notebooks created by one collaborator cannot be run by other collaborators. So, in your case, if Collaborator A creates a notebook and an output table, only Collaborator A will have access to that notebook and table, unless Collaborator B has been explicitly granted permission to access the output table.
References: https://docs.databricks.com/aws/en/clean-rooms#how-is-my-data-managed-in-a-clean-room
https://docs.databricks.com/aws/en/clean-rooms/clean-room-notebook#before-you-begin
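To make the output-table rule above concrete, here is a minimal toy model in plain Python. It is only a sketch of the access logic as I described it (the runner can read by default, anyone else needs an explicit grant). The `OutputTable` class, its methods, and the principal and table names are all hypothetical and are not part of any Databricks API.

```python
from dataclasses import dataclass, field


@dataclass
class OutputTable:
    """Toy stand-in for a clean room output table (hypothetical, not a Databricks API)."""
    name: str
    runner: str  # principal who ran the notebook that produced the table
    grants: set = field(default_factory=set)  # principals explicitly granted read access

    def grant_read(self, principal: str) -> None:
        # Model an explicit permission grant on the output table.
        self.grants.add(principal)

    def can_read(self, principal: str) -> bool:
        # Default rule: only the runner can read; others need an explicit grant.
        return principal == self.runner or principal in self.grants


table = OutputTable(name="overlap_counts", runner="collaborator_a")
print(table.can_read("collaborator_a"))  # True: the runner has default read access
print(table.can_read("collaborator_b"))  # False: no grant yet

table.grant_read("collaborator_b")
print(table.can_read("collaborator_b"))  # True: explicitly granted
```

The point of the sketch is just that, under this understanding, Collaborator B seeing Collaborator A's output is an opt-in step rather than the default.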