Fueled by the exponential growth of external data and the use of AI for innovation, organizations across all industries are looking for effective ways to collaborate with their partners in a privacy-safe way. Some organizations have limited collaboration options and are often required to relinquish control over how their sensitive data is shared, with little to no visibility into how that data is consumed. This creates a significant risk of data misuse and data privacy breaches.
Organizations need an open, flexible, yet privacy-safe way to collaborate and do AI on data, and Databricks Clean Rooms meets these critical needs. As we recently announced at this year's Data + AI Summit, Clean Rooms is in Public Preview on AWS and Azure (Request access to preview here). Clean Rooms is powered by Delta Sharing and allows businesses to easily collaborate with their customers and partners on any cloud without compromising privacy or sharing sensitive data. Participants in a clean room can securely share and join their existing data, and run complex workloads using any language, such as Python, which provides native support for ML. When collaborating in a clean room, your data stays in place and you are always in control of where and how the data is being used.
Databricks Clean Rooms is built for enterprises that are looking for ways to accelerate innovation with data-driven insights. For example, watch the recent Data + AI Summit session, “Collaboration with Databricks Clean Rooms and PETs,” to hear from Mastercard and learn how they protect sensitive data by dynamically determining which privacy-enhancing technologies (PETs) to use based on their collaborators, data, and use cases.
Any language, any workload
Databricks Clean Rooms is built for any analytics and AI workload. Unlike many existing solutions that limit functionality to SQL queries on tabular data, Databricks Clean Rooms also allows you to run your computations in Python. This flexibility supports both simple joins and complex computations for ML/AI use cases. Leveraging the full power of Databricks Notebooks, you can run SQL or Python for complex compute and ML/AI workloads. Collaborators can also use private libraries to keep sensitive algorithms or data processing logic hidden, which helps keep your IP protected. Finally, support for Scala and Java is coming soon.
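As a rough illustration of the "simple join" end of that spectrum, here is a minimal sketch in plain Python of the kind of logic a clean room notebook might contain. It does not use any Databricks API. The table contents, the column names, and the `spend_by_campaign` function are all hypothetical, and the only point is that the computation joins two parties' rows but returns an aggregate rather than the rows themselves.

```python
from collections import defaultdict

# Hypothetical rows contributed by two collaborators (made-up data).
retailer_purchases = [
    {"customer_id": "c1", "amount": 40.0},
    {"customer_id": "c2", "amount": 15.0},
    {"customer_id": "c3", "amount": 25.0},
    {"customer_id": "c1", "amount": 10.0},
]
brand_exposures = [
    {"customer_id": "c1", "campaign": "spring"},
    {"customer_id": "c3", "campaign": "summer"},
    {"customer_id": "c4", "campaign": "spring"},
]


def spend_by_campaign(purchases, exposures):
    """Inner-join the two row sets on customer_id and total spend per campaign."""
    # Build a lookup from customer to campaign (assumes one campaign per customer).
    campaign_of = {row["customer_id"]: row["campaign"] for row in exposures}
    totals = defaultdict(float)
    for row in purchases:
        campaign = campaign_of.get(row["customer_id"])
        if campaign is not None:  # keep only customers present on both sides
            totals[campaign] += row["amount"]
    # Only per-campaign totals leave the function, never individual rows.
    return dict(totals)


result = spend_by_campaign(retailer_purchases, brand_exposures)
print(result)
```

In a real clean room the same idea would be expressed against shared tables in a notebook; this toy version only shows the shape of a join that outputs aggregates.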
Any cloud, with no replication
Databricks Clean Rooms is built for collaboration across regions, clouds, and platforms. For example, collaborators on different clouds, such as one on AWS and another on Azure, can work together in Databricks Clean Rooms. This secure, open, flexible collaboration is powered by Delta Sharing. You can collaborate on all your data and AI, including non-tabular or unstructured data and AI models, all while protecting the privacy of the underlying data.
Collaboration across data platforms is coming soon, using the new Sharing for Lakehouse Federation feature from Delta Sharing (Request access to preview here).
Any scale, any trust level
We understand the critical need for organizations to use clean rooms at scale. Databricks Clean Rooms offers robust collaboration and operational capabilities to meet this demand.
Support for APIs, SQL commands, and built-in Databricks Workflows orchestration is coming soon, so that you can easily automate and manage clean rooms for all your use cases. Multiple collaborators can work together in a Databricks Clean Room at different trust levels using different approval modes. You can also easily access your Clean Rooms outputs in Databricks Notebooks or in your Unity Catalog, enabling seamless integration into subsequent workflows.
How does Databricks Clean Rooms work?
Even though Clean Rooms is a powerful tool, it is easy to set up and get started with.
First, you create a clean room by selecting your preferred cloud provider and region. The clean room can be created in any cloud or region, regardless of which ones you and your collaborators currently use. This creates a privacy-safe, isolated environment hosted by Databricks. Once the clean room is created, you and your collaborators can bring your data, including unstructured data, tables, volumes, and AI models, into the clean room using Delta Sharing. None of the participants in the clean room will be able to see or directly access each other’s data.
Finally, to perform an analysis, you can create a notebook with mutually agreed-upon code and share it in the clean room. Your collaborator can then run the notebook tasks, which are completed using serverless compute. Databricks Clean Rooms allows any collaborator to share a notebook into the clean room, have it approved, and then run it inside the clean room. This flexibility enables you to run any workload in a privacy-safe manner.
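The share, approve, and run steps described above can be pictured with a small toy model. This is a sketch in plain Python, not the Databricks Clean Rooms API: the `ToyCleanRoom` class, its method names, and the rule that every collaborator must approve a notebook before it runs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Set


@dataclass
class SharedNotebook:
    """A notebook shared into the toy clean room (hypothetical structure)."""
    name: str
    author: str
    code: Callable[[Dict[str, list]], object]
    approved_by: Set[str] = field(default_factory=set)


class ToyCleanRoom:
    """Toy model: collaborators share tables and notebooks, approve, then run."""

    def __init__(self, collaborators: Iterable[str]):
        self.collaborators = set(collaborators)
        self._tables: Dict[str, list] = {}  # no method hands these rows back
        self._notebooks: Dict[str, SharedNotebook] = {}

    def _require_member(self, who: str) -> None:
        if who not in self.collaborators:
            raise PermissionError(f"{who} is not a collaborator")

    def share_table(self, owner: str, name: str, rows: list) -> None:
        self._require_member(owner)
        self._tables[f"{owner}.{name}"] = list(rows)

    def share_notebook(self, author: str, name: str, code) -> None:
        self._require_member(author)
        # Assumption: the author implicitly approves their own notebook.
        self._notebooks[name] = SharedNotebook(name, author, code, {author})

    def approve(self, collaborator: str, name: str) -> None:
        self._require_member(collaborator)
        self._notebooks[name].approved_by.add(collaborator)

    def run(self, runner: str, name: str):
        self._require_member(runner)
        notebook = self._notebooks[name]
        missing = self.collaborators - notebook.approved_by
        if missing:
            raise PermissionError(f"not yet approved by: {sorted(missing)}")
        # Only the notebook's return value leaves the room.
        return notebook.code(dict(self._tables))


def count_overlap(tables):
    """Agreed-upon analysis: how many customer IDs appear on both sides."""
    advertiser_ids = {r["customer_id"] for r in tables["advertiser.customers"]}
    publisher_ids = {r["customer_id"] for r in tables["publisher.viewers"]}
    return len(advertiser_ids & publisher_ids)


room = ToyCleanRoom(["advertiser", "publisher"])
room.share_table("advertiser", "customers", [{"customer_id": i} for i in (1, 2, 3)])
room.share_table("publisher", "viewers", [{"customer_id": i} for i in (2, 3, 4)])
room.share_notebook("advertiser", "overlap", count_overlap)

try:
    room.run("publisher", "overlap")  # publisher has not approved yet
    blocked = False
except PermissionError:
    blocked = True

room.approve("publisher", "overlap")
overlap = room.run("publisher", "overlap")
print(blocked, overlap)
```

The toy model hard-codes a single approval rule for simplicity. The product itself supports different approval modes, which this sketch does not attempt to represent.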