Databricks Clean Rooms are secure, governed collaboration environments that let multiple organizations run joint analytics without exchanging raw data, avoiding exposure of sensitive data. Built on Delta Sharing, serverless compute and Unity Catalog, they enforce policy-driven access, comprehensive audit trails and strict controls. Collaborators derive shared insights while keeping sensitive information locked within their own cloud boundaries.
Research Agency Collaboration
We used Clean Rooms for a health research agency that partnered with a regional hospital system to improve readmission prediction for heart patients. The collaboration was bidirectional: the agency contributed de-identified social determinants of health (SDOH) data (zip-code-level income brackets, transportation access scores and medication patterns from surveys), and the hospital system contributed de-identified clinical data (diagnosis codes, discharge summaries, lab trends and prior utilization history). Neither organization shared raw records. Using a Clean Room, we jointly trained a risk stratification model that identified high-risk patients 14 days earlier than prior methods, enabling proactive care coordination while maintaining HIPAA compliance. Notably, the hospital system's EHR data resided in Snowflake. We queried approved objects across Databricks and Snowflake directly alongside the agency's Databricks-hosted data, with zero ETL and unified governance.
Objects (Metadata) Visible to Both Parties:
- Tables: community metrics, patient trajectory
- Columns: patient hash, zip code, readmission
- Notebooks: doh analysis (mutual approval required)
- Temporary outputs: risk score distribution (auto-expiring and read-only)
Both parties had read-only access to the de-identified tables. Approved notebooks ran on serverless compute, with full audit logs of all queries and approvals.
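To make the pattern concrete, here is a minimal plain-Python sketch of the kind of logic an approved notebook might run: join the two de-identified tables on the patient hash and return only an aggregate, never row-level records. This is an illustration, not the notebook used in the engagement. The sample rows, the function name readmission_rate_by_zip and the assignment of columns to tables are all assumptions; in an actual clean room the same logic would run as SQL or PySpark against the shared tables.

```python
from collections import defaultdict

# Hypothetical de-identified rows. Column names follow the metadata listed
# above; which table holds which column is an assumption for illustration.
community_metrics = [
    {"patient_hash": "a1", "zip_code": "30301"},
    {"patient_hash": "b2", "zip_code": "30301"},
    {"patient_hash": "c3", "zip_code": "30310"},
]
patient_trajectory = [
    {"patient_hash": "a1", "readmission": 1},
    {"patient_hash": "b2", "readmission": 0},
    {"patient_hash": "c3", "readmission": 1},
    {"patient_hash": "d4", "readmission": 0},  # no match on the agency side
]


def readmission_rate_by_zip(metrics, trajectory):
    """Inner-join on patient_hash and return only per-zip-code aggregates."""
    zip_by_hash = {row["patient_hash"]: row["zip_code"] for row in metrics}
    counts = defaultdict(lambda: [0, 0])  # zip_code -> [readmissions, patients]
    for row in trajectory:
        zip_code = zip_by_hash.get(row["patient_hash"])
        if zip_code is None:
            continue  # unmatched hashes are dropped, as in an inner join
        counts[zip_code][0] += row["readmission"]
        counts[zip_code][1] += 1
    return {z: readmits / total for z, (readmits, total) in sorted(counts.items())}


result = readmission_rate_by_zip(community_metrics, patient_trajectory)
print(result)
```

Only the per-zip-code rates leave the function, which mirrors the idea that collaborators see derived outputs rather than each other's underlying rows.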
The hospital gained a validated, privacy-preserving risk model to target care management. The agency gained real-world validation of its SDOH predictors, accelerating health insights.