4 weeks ago
I'm new to Databricks and have been tasked with exploring Databricks Clean Rooms. I'm a bit confused about how billing works for Clean Rooms and their overall functionality. Specifically, I'm curious about the following:
Environment Hosting: Are Clean Room environments hosted by Databricks?
Data Sharing: Do participants share data assets through Delta Sharing?
Compute Resources: Are notebooks run using serverless compute?
Billing Details: How are charges applied for Databricks Clean Rooms? Are costs incurred for Delta Sharing and serverless compute?
To investigate, I ran the following SQL query:
SELECT
  usage.usage_type,
  usage.usage_date,
  -- '<region>' is a placeholder for the time zone to convert the UTC timestamps into
  DATE_FORMAT(FROM_UTC_TIMESTAMP(usage.usage_start_time, '<region>'), 'yyyy-MM-dd hh:mm:ss a') AS usage_start_time,
  DATE_FORMAT(FROM_UTC_TIMESTAMP(usage.usage_end_time, '<region>'), 'yyyy-MM-dd hh:mm:ss a') AS usage_end_time,
  usage.usage_quantity AS dbu_consumed,
  usage.usage_metadata.job_run_id,
  usage.sku_name
FROM
  system.billing.usage AS usage
WHERE
  usage.usage_metadata.central_clean_room_id = '<central_clean_room_id>'
ORDER BY
  -- sort on the raw timestamp, not the formatted 12-hour string alias
  usage.usage_start_time ASC;
I initially thought the "job_run_id" corresponded to notebook executions within the Clean Room, but they don't seem to match. Could someone clarify what these IDs represent (the job_name was "clean-room-station-job-task")?
Additionally, is there a way to track overall Clean Room billing usage at an individual level, such as costs for Delta Sharing, running notebooks, and expenses associated with individual collaborators?
Any insights would be greatly appreciated!
2 weeks ago
Hi @RohithChippa,
Databricks Clean Rooms are designed to facilitate secure and privacy-safe collaboration between multiple parties on sensitive data. Here are the details for your questions:
Environment Hosting: Databricks Clean Rooms are hosted by Databricks. When you create a clean room, it establishes a central clean room, which is an isolated ephemeral environment managed by Databricks.
Data Sharing: Participants share data assets through Delta Sharing. This allows collaborators to share tables, volumes, and notebooks with the central clean room without exposing the underlying raw data directly to each other.
Compute Resources: Notebooks in Databricks Clean Rooms are run using serverless compute. This means that the compute resources are managed by Databricks, and the collaborators do not need to manage the infrastructure themselves.
Billing Details:
Charges for Databricks Clean Rooms: The creator of the clean room incurs charges based on a per-collaborator, per-day fee. For example, a 2-party clean room will incur a $50 charge per 24 hours, while a 3-party clean room will be charged $100 per 24 hours.
Compute, Storage, and Data Transfer Costs: In addition to the per collaborator fee, the clean room creator is also responsible for the costs associated with compute, storage, and data transfer usage within the clean room. Compute is billed using the “Jobs Serverless” SKU for AWS users and the “Automated Serverless Compute” SKU for Azure users. Storage is billed using the “Databricks Storage” SKU.
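The per-collaborator arithmetic above can be sketched in a few lines of Python. This is only an illustration of the figures quoted in this thread: the function name and constant are made up for the example, and it assumes the creator is not counted as a billable collaborator (which is what the 2-party = $50, 3-party = $100 examples imply). It does not include compute, storage, or data transfer costs.

```python
# Figure quoted in this thread; not read from any Databricks API.
DAILY_FEE_PER_COLLABORATOR_USD = 50


def clean_room_collaborator_fee(parties: int, days: int = 1) -> int:
    """Estimate the per-collaborator fee in USD for a clean room.

    Assumption: every party except the creator is a billable collaborator,
    matching the 2-party -> $50 and 3-party -> $100 examples.
    """
    if parties < 2:
        raise ValueError("a clean room needs the creator plus at least one collaborator")
    if days < 0:
        raise ValueError("days must be non-negative")
    return (parties - 1) * DAILY_FEE_PER_COLLABORATOR_USD * days


if __name__ == "__main__":
    print(clean_room_collaborator_fee(2))      # 2-party room, one day
    print(clean_room_collaborator_fee(3))      # 3-party room, one day
    print(clean_room_collaborator_fee(2, 30))  # 2-party room left active for 30 days
```

Under these assumptions, a 2-party clean room left active for 30 days would accrue 30 times the daily fee, independent of how much compute is actually used.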
a week ago
So if I add a collaborator to my clean room and leave it for days, will I be charged $50 for each day?
Where can I see these charges?
I have created two workspaces in two different regions in my Azure account, created a clean room in one of them, and added the other workspace as a collaborator in that clean room. The clean room is active, but I haven't seen any charges for adding a collaborator in my monthly Azure bill.
2 weeks ago
The job_run_id in your SQL query represents the unique identifier for a job run within the Databricks environment. This ID is used to track and manage the execution of jobs, including those related to Clean Rooms. The job_name "clean-room-station-job-task" indicates that the job is associated with tasks executed within a Clean Room setup.
Regarding tracking overall Clean Room billing usage at an individual level, including costs for Delta Sharing, running notebooks, and expenses associated with individual collaborators, the billing mechanism involves several components, which are covered in the system tables documentation.
Please refer to: https://docs.databricks.com/en/admin/system-tables/index.html
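As a rough starting point for breaking usage down by component, the rows returned by the query in the original post could be summed per `sku_name`. The sketch below is plain Python over a list of dictionaries and assumes each row carries the `sku_name` and `dbu_consumed` columns from that query; the helper name and the sample SKU values are hypothetical placeholders, not real SKU names or real usage data.

```python
from collections import defaultdict


def usage_by_sku(rows):
    """Sum the usage quantity per SKU.

    `rows` is assumed to be an iterable of dicts shaped like the output of
    the query in the original post (keys: "sku_name", "dbu_consumed").
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row["sku_name"]] += float(row["dbu_consumed"])
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical sample rows; the SKU names are placeholders.
    sample_rows = [
        {"sku_name": "SKU_A", "dbu_consumed": 1.5},
        {"sku_name": "SKU_A", "dbu_consumed": 0.5},
        {"sku_name": "SKU_B", "dbu_consumed": 0.25},
    ]
    for sku, total in usage_by_sku(sample_rows).items():
        print(sku, total)
```

The same grouping could be done directly in SQL with a GROUP BY on `sku_name`; the Python form is shown only to make the aggregation explicit.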
a week ago
The charges for adding a collaborator to a Databricks Clean Room begin once the first collaborator (other than the creator) accesses the clean room via the UI or API. The charge is $50 per collaborator, per 24 hours. This fee continues to accrue regardless of activity until the clean room is deleted or has been inactive for more than 30 days.
Databricks Clean Rooms is currently in ungated public preview on Azure and AWS, and the per-collaborator daily fee will only be applied when the product launches in General Availability (GA) at the end of January 2025. You will therefore not see these charges in your monthly Azure billing until after the GA release.
For now, you should only be seeing charges related to compute consumption, data transfer, VMs, and storage associated with your clean room usage. These charges are billed at the Jobs Serverless rates and are incurred by the clean room creator.
You can monitor these charges in your Azure billing account, but the specific per-collaborator daily fee will not appear until the GA release.
a week ago
Do you mean I will be charged after January 2025 for all of these days, or will I only be charged the per-collaborator fee if I use the clean room after the GA release?
a week ago
You will only be charged the per-collaborator fee starting from the General Availability (GA) release at the end of January 2025. The charges will not be retroactive for the days before the GA release.
a week ago
Is adding a foreign catalog table to a clean room also only available after the GA release? I tried it but was not able to see the foreign catalog in the add data assets tab.