Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to Retrieve DBU Count per Compute Type for Accurate Cost Calculation?

saicharandeepb
New Contributor III

Hello Everyone,

We are currently working on a cost analysis initiative to gain deeper insights into our Databricks usage. As part of this effort, we are trying to calculate the hourly cost of each Databricks compute instance by utilizing the Azure Retail Prices API and storing this data in a table for further analysis.

During this process, we encountered a challenge in accurately calculating the cost per compute type. After consulting with Azure Support, we understood that the total cost displayed on Azure Databricks Pricing Page is composed of two components: the VM (infrastructure) cost and the DBU (Databricks Unit) cost.

While we are able to determine the VM cost using the Azure API, we are unable to map the correct number of DBUs per compute type to complete the total cost calculation. We have not found a reliable way to determine the DBU count associated with each VM/compute type.
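For the VM half of the calculation, here is a minimal sketch of how a query against the Azure Retail Prices API might be built and its response filtered. The endpoint and the response fields (`Items`, `retailPrice`, `productName`, `skuName`, `unitOfMeasure`) follow the public API; the helper names, the region, and the sample payload are illustrative assumptions, and the Windows/Spot filter is a heuristic that should be checked against real responses.

```python
from typing import Optional
from urllib.parse import urlencode

RETAIL_PRICES_URL = "https://prices.azure.com/api/retail/prices"


def build_vm_price_url(arm_sku_name: str, region: str) -> str:
    """Build a Retail Prices query URL for one VM size in one region."""
    odata_filter = (
        "serviceName eq 'Virtual Machines' "
        f"and armRegionName eq '{region}' "
        f"and armSkuName eq '{arm_sku_name}' "
        "and priceType eq 'Consumption'"
    )
    return f"{RETAIL_PRICES_URL}?{urlencode({'$filter': odata_filter})}"


def linux_payg_hourly_price(payload: dict) -> Optional[float]:
    """Pick the pay-as-you-go Linux hourly price from an API response.

    Heuristic (an assumption): skip Windows products and Spot / Low Priority SKUs.
    """
    for item in payload.get("Items", []):
        if "Windows" in item.get("productName", ""):
            continue
        sku = item.get("skuName", "")
        if "Spot" in sku or "Low Priority" in sku:
            continue
        if item.get("unitOfMeasure") == "1 Hour":
            return item["retailPrice"]
    return None


# Hand-written sample payload; the prices are made up, not real Azure prices.
sample = {
    "Items": [
        {"productName": "Virtual Machines DSv2 Series Windows", "skuName": "DS3 v2",
         "unitOfMeasure": "1 Hour", "retailPrice": 0.50},
        {"productName": "Virtual Machines DSv2 Series", "skuName": "DS3 v2 Spot",
         "unitOfMeasure": "1 Hour", "retailPrice": 0.05},
        {"productName": "Virtual Machines DSv2 Series", "skuName": "DS3 v2",
         "unitOfMeasure": "1 Hour", "retailPrice": 0.25},
    ]
}

url = build_vm_price_url("Standard_DS3_v2", "westeurope")
price = linux_payg_hourly_price(sample)
```

In a real run, the URL would be fetched with an HTTP client and the parsed JSON passed to `linux_payg_hourly_price`; the network call is left out so the sketch stays self-contained.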

My questions to the community:

  • Is there a documented way to retrieve the DBU count per VM or compute type?

  • Is this information available through any Databricks APIs or system-level tables that we can query?

  • Has anyone built a similar cost model and can share tips or best practices?

Any guidance or pointers would be really appreciated!

Thanks in advance,

Charan.

4 REPLIES

BS_THE_ANALYST
Esteemed Contributor II

@saicharandeepb, have you looked at the system billing tables in Databricks yet? https://learn.microsoft.com/en-us/azure/databricks/admin/system-tables/billing


There is a usage_quantity field that reports usage in the unit named by usage_unit (for example, DBU).

The same applies to this table as well:
https://learn.microsoft.com/en-us/azure/databricks/admin/system-tables/pricing
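As a rough illustration of how the two tables fit together, the sketch below matches each usage row to the list price that was in effect when the usage ended, then sums quantity times price. It uses plain Python dicts as stand-ins for rows; `price_per_dbu` is a flattened stand-in for the `pricing.default` field, and the SKU name and prices are made up. On Databricks this would normally be a SQL join between system.billing.usage and system.billing.list_prices rather than Python loops.

```python
from datetime import datetime
from typing import Optional


def price_in_effect(sku_name: str, usage_end: datetime, prices: list) -> Optional[float]:
    """Return the list price that was valid for sku_name at usage_end, if any."""
    for p in prices:
        if p["sku_name"] != sku_name:
            continue
        started = p["price_start_time"] <= usage_end
        # A missing end time means the price is still current.
        not_ended = p["price_end_time"] is None or usage_end < p["price_end_time"]
        if started and not_ended:
            return p["price_per_dbu"]
    return None


def total_list_cost(usage_rows: list, prices: list) -> float:
    """Sum usage_quantity * list price over all usage rows with a known price."""
    total = 0.0
    for u in usage_rows:
        price = price_in_effect(u["sku_name"], u["usage_end_time"], prices)
        if price is not None:
            total += u["usage_quantity"] * price
    return total


# Made-up rows; EXAMPLE_SKU and both prices are placeholders, not real list prices.
prices = [
    {"sku_name": "EXAMPLE_SKU", "price_start_time": datetime(2024, 1, 1),
     "price_end_time": datetime(2025, 1, 1), "price_per_dbu": 0.25},
    {"sku_name": "EXAMPLE_SKU", "price_start_time": datetime(2025, 1, 1),
     "price_end_time": None, "price_per_dbu": 0.5},
]
usage_rows = [
    {"sku_name": "EXAMPLE_SKU", "usage_end_time": datetime(2024, 6, 1), "usage_quantity": 10.0},
    {"sku_name": "EXAMPLE_SKU", "usage_end_time": datetime(2025, 3, 1), "usage_quantity": 5.0},
]
cost = total_list_cost(usage_rows, prices)
```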


 


All the best,
BS

Chiran-Gajula
New Contributor

The majority of the relevant information can be found in the system.billing.usage and system.compute.clusters tables.
To view DBUs consumed by instance type, you can run the following query, then explore the other fields in these tables:

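-- Total DBUs consumed, grouped by worker VM type.
-- system.compute.clusters keeps one row per configuration change,
-- so keep only the latest row per cluster to avoid double-counting usage.
WITH latest_clusters AS (
  SELECT workspace_id, cluster_id, worker_node_type
  FROM system.compute.clusters
  QUALIFY ROW_NUMBER() OVER (
    PARTITION BY workspace_id, cluster_id
    ORDER BY change_time DESC
  ) = 1
)
SELECT
  c.worker_node_type AS vm_instance,
  SUM(u.usage_quantity) AS total_dbus
FROM system.billing.usage u
JOIN latest_clusters c
  ON u.workspace_id = c.workspace_id
  AND u.usage_metadata.cluster_id = c.cluster_id
WHERE u.usage_unit = 'DBU'
GROUP BY c.worker_node_type
ORDER BY total_dbus DESC;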

G.Chiranjeevi

nayan_wylde
Honored Contributor III

1. Is there a documented way to retrieve the DBU count per VM or compute type?

Yes, but it's not directly exposed via a single API or table. The DBU consumption rate depends on:

  • Compute type (Jobs Compute, All-Purpose Compute, SQL Compute, etc.)
  • VM instance type (e.g., Standard_D16s_v3)
  • Databricks pricing tier (Standard, Premium, Enterprise)
  • Cloud provider (Azure, AWS, GCP)

Databricks publishes DBU calculators and pricing matrices for each cloud provider, which list DBU rates per instance type and workload type. For Azure, refer to the instance types pricing page: https://www.databricks.com/product/pricing/product-pricing/instance-types
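To make the dependencies above concrete, here is a minimal sketch of the cost arithmetic, assuming the total hourly cost of a node is its VM price plus its DBU rate multiplied by the price per DBU. The class and function names are hypothetical, and every number is a placeholder rather than a real Azure or Databricks price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCostInputs:
    vm_price_per_hour: float  # infrastructure price, from the cloud provider
    dbu_per_hour: float       # DBU rate, depends on the VM instance type
    price_per_dbu: float      # depends on compute type and pricing tier


def node_hourly_cost(node: NodeCostInputs) -> float:
    """VM cost plus DBU cost for one node running for one hour."""
    return node.vm_price_per_hour + node.dbu_per_hour * node.price_per_dbu


def cluster_hourly_cost(driver: NodeCostInputs, worker: NodeCostInputs, num_workers: int) -> float:
    """One driver plus num_workers identical workers."""
    return node_hourly_cost(driver) + num_workers * node_hourly_cost(worker)


# Placeholder inputs only: 0.25 per VM hour, 0.75 DBU per hour, 0.5 per DBU.
node = NodeCostInputs(vm_price_per_hour=0.25, dbu_per_hour=0.75, price_per_dbu=0.5)
per_node = node_hourly_cost(node)                 # 0.25 + 0.75 * 0.5
per_cluster = cluster_hourly_cost(node, node, 4)  # driver + 4 workers
```

Keeping the DBU rate and the price per DBU as separate inputs means the same instance type can be costed under different workload types or tiers by changing one field.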

 

 

2. Is this information available through any Databricks APIs or system-level tables?
Yes. You can use Databricks system tables to analyze DBU consumption:
Key tables:

  • system.billing.usage: Tracks DBU usage per workload, SKU, and time window.
  • system.compute.clusters: Contains cluster metadata, including node types and configurations.
  • system.compute.node_types: Maps node types to hardware specs (useful for correlating with DBU rates).
 
SELECT 
  usage_date,
  sku_name,
  usage_quantity AS dbus_consumed,
  usage_metadata.cluster_id,
  usage_metadata.job_id
FROM system.billing.usage
WHERE usage_unit = 'DBU'
ORDER BY usage_date DESC; 

3. Has anyone built a similar cost model and can share tips or best practices?
Yes. Some of the best practices I follow are:

  • Use job clusters and spot instances to reduce DBU and VM costs by up to 50%.
  • Tag clusters with environment and workload type for granular cost attribution.
  • Use cluster policies to restrict high-cost instance types and enforce auto-termination.
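As an illustration of the last two points, the sketch below builds a cluster policy definition as a Python dict and serializes it to JSON. The attribute paths and the `allowlist` and `range` types follow the Databricks cluster policy definition format; the specific instance types, limits, and tag values are assumptions chosen for the example.

```python
import json

# Example policy definition; instance types, limits, and tag values are illustrative.
policy_definition = {
    # Only allow a short list of smaller VM sizes.
    "node_type_id": {
        "type": "allowlist",
        "values": ["Standard_DS3_v2", "Standard_DS4_v2"],
        "defaultValue": "Standard_DS3_v2",
    },
    # Force auto-termination to stay between 10 and 60 minutes.
    "autotermination_minutes": {
        "type": "range",
        "minValue": 10,
        "maxValue": 60,
        "defaultValue": 30,
    },
    # Require an environment tag from a fixed set for cost attribution.
    "custom_tags.environment": {
        "type": "allowlist",
        "values": ["dev", "test", "prod"],
    },
}

policy_json = json.dumps(policy_definition, indent=2)
```

The resulting JSON string is what would be pasted into the policy definition editor or sent as the policy definition when creating a policy through the API.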

 

saicharandeepb
New Contributor III

Hi everyone, just to clarify my question — I’m looking for the DBU count per compute type (per instance type), not the total DBU consumption per workload.

In other words, I want to know the fixed DBU rate assigned to each compute SKU (for example, DS3 v2 = 0.75 DBU/hour, DS4 v2 = 1.5 DBU/hour, DS5 v2 = 3.0 DBU/hour, etc.) so that I can accurately estimate costs for different cluster configurations.


 

I’m not referring to usage or billing metrics that show total DBUs consumed by workloads over time — I just need the reference values that define how many DBUs are billed per hour for each compute type.
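To show the kind of reference table I mean, here is a minimal sketch that stores the example rates quoted above and looks them up by Azure VM size. The three rates are only the examples from this post, not a complete or verified list, and the conversion from a display name such as "DS3 v2" to "Standard_DS3_v2" is an assumed naming convention that may not hold for every VM family.

```python
from typing import Optional

# Example reference values from this post only (DBU per hour per node).
DBU_PER_HOUR = {
    "DS3 v2": 0.75,
    "DS4 v2": 1.5,
    "DS5 v2": 3.0,
}


def to_arm_sku_name(display_name: str) -> str:
    """Convert 'DS3 v2' to 'Standard_DS3_v2' (assumed naming convention)."""
    return "Standard_" + "_".join(display_name.split())


def dbu_rate_for_arm_sku(arm_sku_name: str) -> Optional[float]:
    """Look up the DBU rate by Azure VM size; None if the size is not listed."""
    by_arm_name = {to_arm_sku_name(name): rate for name, rate in DBU_PER_HOUR.items()}
    return by_arm_name.get(arm_sku_name)


rate = dbu_rate_for_arm_sku("Standard_DS4_v2")
missing = dbu_rate_for_arm_sku("Standard_D16s_v3")  # not in the example table
```

With a complete table of this shape, the rate could be joined to the VM prices from the Azure Retail Prices API on the VM size name.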

Thanks in advance for any guidance.
