You have two related questions on Databricks SQL Warehouse (serverless) cost attribution and resource sharing across workspaces. Here’s a clear answer based on current Databricks capabilities and industry practice.
Tracing Costs per Use Case (SQL Warehouse/Serverless)
Direct cost attribution for individual use cases, such as by Unity Catalog catalog or schema, is not natively supported at that granularity when multiple use cases share a single SQL warehouse per workspace. Costs are tracked per warehouse or cluster, not per executed query or per accessed catalog. However, there are some strategies to improve cost visibility:
Available Approaches
- Tags on Warehouses/Clusters: While you already know about tags, these assign costs per resource, not per query or catalog accessed. If use cases share a warehouse, this won't help for fine-grained analysis.
- Databricks System Billing Usage Table: The system.billing.usage table provides data at the warehouse/cluster/user/job level, but not by Unity Catalog object (catalog/schema); see the first sketch after this list.
- Audit Logs & Query History: Databricks captures query history and user activity, and it can log which Unity Catalog objects (catalogs, schemas, tables) each query accessed.
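To see what raw data you have to work with, here is a minimal PySpark sketch against the billing and query-history system tables. It assumes a Databricks notebook with `spark` in scope and system tables enabled; the column names (`usage_metadata.warehouse_id`, `compute.warehouse_id`, `total_duration_ms`, etc.) follow current Databricks documentation but should be verified against your workspace's schema.

```python
# Minimal sketch of the raw inputs for cost attribution. Assumes a Databricks
# notebook (spark in scope) with system tables enabled; verify column names
# against your workspace before relying on them.

# Daily DBU consumption per SQL warehouse from the billing system table.
usage = spark.sql("""
    SELECT usage_date,
           usage_metadata.warehouse_id AS warehouse_id,
           SUM(usage_quantity)         AS dbus
    FROM system.billing.usage
    WHERE usage_metadata.warehouse_id IS NOT NULL
    GROUP BY usage_date, usage_metadata.warehouse_id
""")

# Per-statement history on those warehouses: who ran what, and for how long.
history = spark.sql("""
    SELECT statement_id,
           executed_by,
           compute.warehouse_id AS warehouse_id,
           start_time,
           total_duration_ms
    FROM system.query.history
""")

usage.show(5)
history.show(5)
```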
Custom Cost Attribution by Query Audit
- Export Query Logs: Query history can be read from the system.query.history table, and Unity Catalog access events from system.access.audit and system.access.table_lineage. Together these capture the user, the query, and the catalogs, schemas, and tables accessed.
- Estimate Per-Query Costs: Combine query logs with warehouse usage metrics (total cost, execution time, etc.). You can attribute a share of warehouse cost to each query based on execution duration, data scanned, or other metrics.
- Aggregate by Catalog/Schema: Aggregate at your required level (catalog or schema) to estimate costs using the derived per-query share; see the sketches after this list.
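For the export step, one way to recover which catalog and schema each statement touched is the Unity Catalog lineage system table. This is a sketch, not a guaranteed mapping: it assumes that for DBSQL queries the lineage table's `entity_id` lines up with `statement_id` in the query history, which you should verify in your environment.

```python
# Sketch: map each SQL statement to the catalogs/schemas it read.
# Assumption: for rows with entity_type = 'DBSQL_QUERY', entity_id matches
# system.query.history.statement_id -- verify before relying on this.
statement_catalogs = spark.sql("""
    SELECT entity_id            AS statement_id,
           source_table_catalog AS catalog,
           source_table_schema  AS schema
    FROM system.access.table_lineage
    WHERE entity_type = 'DBSQL_QUERY'
      AND source_table_catalog IS NOT NULL
""").dropDuplicates(["statement_id", "catalog", "schema"])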
This approach won't be perfectly precise, but it gives a data-driven approximation of cost per use case (catalog or schema) within a shared warehouse.
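Putting the pieces together, the sketch below prorates each day's warehouse DBUs across queries by execution time and rolls the result up to catalog/schema level. It reuses the `usage`, `history`, and `statement_catalogs` frames from the sketches above; duration-based proration is a simplifying assumption, and bytes scanned or task time would work just as well.

```python
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Each query's share of a day's DBUs = its duration / total duration of all
# queries on that warehouse that day. `usage` has one row per warehouse/day,
# so the join keeps one row per query.
per_query = (
    history
    .withColumn("usage_date", F.to_date("start_time"))
    .join(usage, ["warehouse_id", "usage_date"])
    .withColumn(
        "dbu_share",
        F.col("dbus") * F.col("total_duration_ms")
        / F.sum("total_duration_ms").over(
            Window.partitionBy("warehouse_id", "usage_date")
        ),
    )
)

# Roll the per-query share up to the catalog/schema the query touched.
per_catalog = (
    per_query
    .join(statement_catalogs, "statement_id")
    .groupBy("catalog", "schema")
    .agg(F.round(F.sum("dbu_share"), 2).alias("estimated_dbus"))
    .orderBy(F.desc("estimated_dbus"))
)
per_catalog.show()
```

Note that a query touching tables in several catalogs appears once per catalog here, so the shares can over-count; pick a splitting rule (for example, divide the share evenly) if that matters for your chargeback.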
Why True Metering per Catalog Isn’t Native
Databricks bills compute at the warehouse or cluster level, so native metering by catalog or schema is unavailable. Per-object metering would add runtime overhead and inflate audit logs; custom analysis over the system tables, as above, is the practical workaround.
Sharing SQL Warehouse/Serverless Across Workspaces
Currently, Databricks SQL Warehouses (serverless or classic) cannot be shared natively across workspaces. A warehouse is scoped to a single workspace, and access is managed by that workspace's security context.
Typical Approaches for Analytics Frontends
- Direct Workspace Connection: Tools like Power BI or Tableau must connect to a specific workspace's SQL Warehouse endpoint.
- Repeat Setup: For shared consumption across multiple workspaces, replicate the warehouse configuration (and data sources) in each workspace.
- Unity Catalog Data Sharing: Data (not compute) can be shared across workspaces using Unity Catalog's data sharing capabilities, but not the SQL Warehouse compute resource itself; a minimal sketch follows this list.
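For concreteness, here is a hedged sketch of the data-sharing path. Workspaces attached to the same Unity Catalog metastore can already read the same catalogs (subject to workspace bindings and grants); the Delta Sharing statements below cover the cross-metastore case. `analytics_share`, `partner_bi`, and the table name are placeholders.

```python
# Sketch: share data (not compute) across metastores via Delta Sharing.
# All object names below are placeholders; replace '<sharing-identifier>'
# with the recipient metastore's sharing identifier for
# Databricks-to-Databricks sharing.
for stmt in [
    "CREATE SHARE IF NOT EXISTS analytics_share",
    "ALTER SHARE analytics_share ADD TABLE main.reporting.daily_kpis",
    "CREATE RECIPIENT IF NOT EXISTS partner_bi USING ID '<sharing-identifier>'",
    "GRANT SELECT ON SHARE analytics_share TO RECIPIENT partner_bi",
]:
    spark.sql(stmt)
```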
Alternatives
- Common Workspace for Shared Analytics: Set up one "analytics" workspace with shared SQL Warehouses, and connect all frontends there.
- APIs/External Query Services: Build APIs or service layers that sit atop the warehouse and serve multiple frontends (a minimal sketch follows), but this adds architectural overhead.
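Whichever workspace hosts the warehouse, clients reach it the same way. Below is a minimal sketch using the databricks-sql-connector package (`pip install databricks-sql-connector`); the hostname, HTTP path, and token are placeholders for your workspace's values.

```python
from databricks import sql

# Sketch: any frontend or service layer connects to one workspace's
# warehouse endpoint. All connection values below are placeholders.
with sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<personal-access-token>",
) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT current_catalog(), current_schema()")
        print(cur.fetchone())
```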
Summary:
Cost analysis per catalog or schema in a single warehouse requires combining query logs with billing data and attributing fractions of warehouse cost. SQL Warehouses cannot currently be shared across workspaces; each workspace needs its own warehouse endpoint. Unity Catalog enables data sharing, but not compute sharing.