I'm new to Databricks and have been tasked with exploring Databricks Clean Rooms. I'm a bit confused about how billing works for Clean Rooms and their overall functionality. Specifically, I'm curious about the following:
Environment Hosting: Are Clean Room environments hosted by Databricks?
Data Sharing: Do participants share data assets through Delta Sharing?
Compute Resources: Are notebooks run using serverless compute?
Billing Details: How are charges applied for Databricks Clean Rooms? Are costs incurred for Delta Sharing and serverless compute?
To investigate, I ran the following SQL query:
SELECT
  usage.usage_type,
  usage.usage_date,
  DATE_FORMAT(FROM_UTC_TIMESTAMP(usage.usage_start_time, '<timezone>'), 'yyyy-MM-dd hh:mm:ss a') AS usage_start_time,
  DATE_FORMAT(FROM_UTC_TIMESTAMP(usage.usage_end_time, '<timezone>'), 'yyyy-MM-dd hh:mm:ss a') AS usage_end_time,
  usage.usage_quantity AS dbu_consumed,
  usage.usage_metadata.job_run_id,
  usage.sku_name
FROM
  system.billing.usage AS usage
WHERE
  usage.usage_metadata.central_clean_room_id = '<central_clean_room_id>'
ORDER BY
  usage.usage_start_time ASC; -- sort on the raw timestamp column, not the formatted-string alias
I initially thought the job_run_id values corresponded to notebook executions within the Clean Room, but they don't seem to match any runs I can find. Could someone clarify what these IDs represent (the associated job_name was "clean-room-station-job-task")?
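For reference, here's roughly how I tried to cross-check those IDs. I'm assuming system.lakeflow.job_run_timeline is the right table and that usage_metadata.job_run_id should line up with its run_id column, which may well be the wrong assumption:

SELECT
  u.usage_metadata.job_run_id,
  r.job_id,
  r.result_state,
  r.period_start_time,
  r.period_end_time
FROM system.billing.usage AS u
  -- LEFT JOIN so usage rows still show up even when no matching run is found
  LEFT JOIN system.lakeflow.job_run_timeline AS r
    ON u.workspace_id = r.workspace_id
    AND u.usage_metadata.job_run_id = r.run_id
WHERE
  u.usage_metadata.central_clean_room_id = '<central_clean_room_id>';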
Additionally, is there a way to track overall Clean Room billing usage at an individual level, such as costs for Delta Sharing, running notebooks, and expenses associated with individual collaborators?
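The closest I've gotten so far is joining usage against system.billing.list_prices to estimate cost per SKU for the Clean Room, something like the sketch below. I'm not sure whether grouping by sku_name really separates Delta Sharing charges from serverless notebook compute, or whether per-collaborator attribution is possible at all:

SELECT
  u.sku_name,
  u.usage_unit,
  SUM(u.usage_quantity) AS total_usage,
  -- multiply usage by the list price in effect when the usage started
  SUM(u.usage_quantity * lp.pricing.default) AS estimated_list_cost
FROM system.billing.usage AS u
  JOIN system.billing.list_prices AS lp
    ON u.sku_name = lp.sku_name
    AND u.cloud = lp.cloud
    AND u.usage_start_time >= lp.price_start_time
    AND (lp.price_end_time IS NULL OR u.usage_start_time < lp.price_end_time)
WHERE
  u.usage_metadata.central_clean_room_id = '<central_clean_room_id>'
GROUP BY
  u.sku_name,
  u.usage_unit
ORDER BY
  estimated_list_cost DESC;

Is this the right direction, or is there a better way to break Clean Room charges down?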
Any insights would be greatly appreciated!