Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Best practice for unified cloud cost attribution (Databricks + Azure)?

Mortenfromdk
New Contributor

Hi! I’m working on a FinOps initiative to improve cloud cost visibility and attribution across departments and projects in our data platform. We tag production workflows at the department level and can get a decent view in Azure Cost Analysis by filtering on tags like department: X. But I am struggling to bring Databricks into that picture, especially when it comes to SQL Serverless Warehouses.

My goal is to be able to report: total project cost = Azure resources + SQL Serverless.

Questions:

1. Tagging Databricks SQL Warehouses for Attribution

Is creating a separate SQL Warehouse per department/project the only way to track department/project usage, or is there another option?

2. Joining Azure + Databricks Costs

Is there a clean way to join usage data from Azure Cost Analysis with Databricks billing data (e.g., from system.billing.usage)?

I'd love to get a unified view of total cost per department or project — Azure Cost has most of it, but not SQL serverless warehouse usage or Vector Search or Model Serving.

3. Sharing Cost

For those of you doing this well — how do you present project-level cost data to stakeholders like departments or customers?

1 REPLY

mark_ott
Databricks Employee

To achieve unified cloud cost visibility and attribution for Azure and Databricks (including SQL Serverless Warehouses), consider the following best practices and solutions:

Tagging Databricks SQL Warehouses for Attribution

Creating a separate SQL Warehouse per department/project is one way to track usage, but it's not the only method. Databricks supports tagging resources (including SQL Warehouses, endpoints, and jobs) with custom key-value pairs such as department, environment, or project name. These tags are stored on each asset and propagate to system tables like system.billing.usage, enabling granular cost attribution and aggregation over any category without requiring a separate warehouse per department. For automated consistency, use cluster policies or Infrastructure-as-Code (e.g., Terraform) to enforce tag blocks on new and existing warehouses and jobs.
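Once tags propagate into system.billing.usage, attribution is essentially a group-by over the custom tag. Here's a minimal Python sketch of that aggregation; the row shape (usage_usd, custom_tags) and the values are illustrative assumptions, standing in for rows exported from the system table:

```python
from collections import defaultdict

# Illustrative rows exported from system.billing.usage; the field names
# usage_usd and custom_tags are assumptions for this sketch.
usage_rows = [
    {"sku": "SQL_SERVERLESS", "usage_usd": 12.50, "custom_tags": {"department": "finance"}},
    {"sku": "SQL_SERVERLESS", "usage_usd": 7.25,  "custom_tags": {"department": "marketing"}},
    {"sku": "MODEL_SERVING",  "usage_usd": 3.00,  "custom_tags": {"department": "finance"}},
    {"sku": "JOBS_COMPUTE",   "usage_usd": 5.00,  "custom_tags": {}},  # untagged resource
]

def cost_by_tag(rows, tag_key):
    """Aggregate cost per value of a custom tag; untagged rows bucket under 'untagged'."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["custom_tags"].get(tag_key, "untagged")] += row["usage_usd"]
    return dict(totals)

print(cost_by_tag(usage_rows, "department"))
# → {'finance': 15.5, 'marketing': 7.25, 'untagged': 5.0}
```

In practice you'd run the equivalent GROUP BY directly in Databricks SQL against system.billing.usage; the point is that one shared warehouse with good tags yields the same per-department breakdown as separate warehouses would.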

Joining Azure + Databricks Costs

You can join Azure Cost Analysis data (filtered by tags) with Databricks billing data from system.billing.usage. Azure Cost Analysis aggregates Databricks charges with the underlying infrastructure broken out beneath them, but SQL Serverless and advanced Databricks features (Vector Search, Model Serving) require querying the Databricks system tables directly. Both data sources can be exported and joined in BI tools (such as Power BI or Tableau), via a data lake, or programmatically (Python, SQL) to produce a unified cost view at the department/project level. Use matching tag structures (department, project, environment) in both environments so records align and join accurately.
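The join itself is simple once both sides are keyed by the same tag value. A minimal Python sketch, assuming you've already reduced each source to per-department totals (the numbers and key names are illustrative, not real billing data):

```python
from collections import defaultdict

# Hypothetical per-department totals: one dict from an Azure Cost Management
# export filtered by the department tag, one from system.billing.usage.
azure_costs = {"finance": 120.0, "marketing": 80.0}
databricks_costs = {"finance": 15.5, "marketing": 7.25, "untagged": 5.0}

def unified_costs(azure, databricks):
    """Full outer join on the department key, summing cost from both sources."""
    total = defaultdict(float)
    for source in (azure, databricks):
        for dept, usd in source.items():
            total[dept] += usd
    return dict(total)

print(unified_costs(azure_costs, databricks_costs))
# → {'finance': 135.5, 'marketing': 87.25, 'untagged': 5.0}
```

Note the deliberate outer-join behavior: departments present in only one source (like the 'untagged' bucket) still surface in the unified view, which is exactly what you want for spotting attribution gaps.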

Sharing Cost Information with Stakeholders

Successful organizations present unified cost reports visually, using dashboards or BI tools with drilldowns by department, project, or customer. Best practices include:

  • Consistent tag taxonomy across Azure and Databricks for seamless data merging.

  • Automated export and processing of cost data, with regular updates (daily/weekly).

  • Visualizations—charts, trend graphs, breakdown tables—to highlight usage patterns and major cost drivers.

  • Clear documentation and guidance, helping stakeholders interpret the data and control their own spend.

  • Reports delivered via cloud cost management portals, dashboards (Power BI/Tableau), or periodic email summaries for maximum accessibility.

Practical Tips

  • Start with a clear tagging standard on both Azure and Databricks, enforced via automation.

  • Regularly audit for untagged resources and retroactively apply tags as needed.

  • Leverage the custom tag columns in system.billing.usage to aggregate costs by any variable (department, customer, etc.) for SQL Warehouses, Model Serving, Vector Search, and other advanced services.

  • Use both Azure and Databricks RBAC to limit access to sensitive cost reporting and ensure only authorized users view detailed breakdowns.
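The untagged-resource audit mentioned above can be automated with a small script. A sketch in Python, assuming you've pulled an inventory of warehouses and jobs (in practice from the Databricks API or system tables; the names and required tag set here are hypothetical):

```python
# Illustrative inventory of Databricks resources with their current tags.
resources = [
    {"name": "wh-shared",   "tags": {"department": "finance", "project": "etl"}},
    {"name": "wh-adhoc",    "tags": {"project": "sandbox"}},  # missing department
    {"name": "job-nightly", "tags": {}},                      # fully untagged
]

# The tag keys your standard requires on every resource (an assumption here).
REQUIRED_TAGS = {"department", "project"}

def audit_tags(resources, required=REQUIRED_TAGS):
    """Return (name, missing_keys) for every resource lacking a required tag."""
    findings = []
    for r in resources:
        missing = sorted(required - r["tags"].keys())
        if missing:
            findings.append((r["name"], missing))
    return findings

print(audit_tags(resources))
# → [('wh-adhoc', ['department']), ('job-nightly', ['department', 'project'])]
```

Running a check like this on a schedule (and alerting on non-empty findings) keeps the 'untagged' cost bucket from silently growing.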

Implementing these strategies can provide your organization with a unified, project-level cloud cost breakdown, increasing transparency, accountability, and actionable insights for all stakeholders.