Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

AWS DATABRICKS COMPUTE COSTS IN USAGE DASHBOARDS

margarita_shir
New Contributor III

Good afternoon!

I know that the built-in Databricks usage dashboards currently only display DBU (Databricks Unit) usage. Our Databricks workspaces are running on AWS, and the clusters provision EC2 instances automatically.

I would like to include the underlying AWS EC2 costs alongside DBU usage in these dashboards.

  • What is the recommended approach for capturing these EC2 costs?

  • What is the proper way to connect this data to Databricks and incorporate it into the workspace usage dashboards?

Any guidance, best practices, or example setups would be greatly appreciated.

Thank you!

4 REPLIES

Hubert-Dudek
Databricks MVP

There is a plan to include EC2 costs in cloud_infra_cost in the system tables. Not sure about its status, but the table is already visible.


My blog: https://databrickster.medium.com/

pradeep_singh
Contributor III

I think the cloud_infra_cost table is still in Private Preview.

Another option:

You can export the AWS Cost and Usage Report (CUR) to an S3 bucket, load it into a UC table, and join it to the Databricks system tables. If your EC2 instances are tagged appropriately and those tags propagate to the CUR, that will further help you get the right metrics.

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

pradeep_singh
Contributor III

Another thread on cloud_infra_cost:
https://community.databricks.com/t5/administration-architecture/cloud-infra-costs/td-p/96544

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

Ashwin_DSA
Databricks Employee

Hi @margarita_shir,

If you are still looking for an answer: the right way to bring the underlying EC2 cost into your reporting is to use Databricks system tables for the Databricks side of the picture and AWS billing data for the cloud side. The built-in usage dashboards are powered by system.billing.usage and system.billing.list_prices, so they’re great for DBU and Databricks SKU visibility, but the EC2 costs themselves have to come from AWS billing data rather than from the Databricks usage tables alone.

On AWS, the recommended pattern is to use an AWS Cost and Usage Report (CUR) 2.0 export to S3, expose that location to Databricks using a Unity Catalog storage credential and external location, ingest the CUR data into Delta tables, and then join it with Databricks usage from system.billing.usage.

Some customers may also see references to system.billing.cloud_infra_cost. That is a Databricks system table, not an AWS-native table, and it is currently a Private Preview capability. Because of that, I’d still consider AWS billing data plus the documented Databricks system tables as the generally available path today, and treat system.billing.cloud_infra_cost as preview-only if it's enabled for your account. Databricks also published a recent blog on unifying Databricks and cloud infrastructure costs, and there is an open-source Cloud Infra Cost Field Solution for AWS that sets up the ingestion, modelling, and dashboarding pattern today.
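To make the join concrete, here is a minimal sketch in plain Python with toy data. It is not the Field Solution's implementation, and the field names (`cluster_id`, `dbus`, `price_per_dbu`, `unblended_cost`) are simplified assumptions rather than the exact system.billing.usage or CUR 2.0 schemas. In a real workspace you would express the same logic as a SQL or PySpark join over the ingested Delta tables.

```python
from collections import defaultdict

# Toy rows standing in for system.billing.usage joined to list prices.
# Field names are simplified assumptions, not the real table schema.
dbu_usage = [
    {"usage_date": "2025-01-01", "cluster_id": "c-1", "dbus": 10.0, "price_per_dbu": 0.5},
    {"usage_date": "2025-01-01", "cluster_id": "c-2", "dbus": 4.0, "price_per_dbu": 0.5},
]

# Toy rows standing in for ingested CUR line items, with the cluster id
# already extracted from the resource tags (None when the tag is missing).
cur_line_items = [
    {"usage_date": "2025-01-01", "cluster_id": "c-1", "unblended_cost": 3.0},
    {"usage_date": "2025-01-01", "cluster_id": "c-1", "unblended_cost": 1.5},
    {"usage_date": "2025-01-01", "cluster_id": None, "unblended_cost": 2.0},
]


def combine_costs(usage_rows, cur_rows):
    """Full outer join of DBU cost and EC2 cost on (usage_date, cluster_id)."""
    totals = defaultdict(lambda: {"dbu_cost": 0.0, "ec2_cost": 0.0})
    for row in usage_rows:
        key = (row["usage_date"], row["cluster_id"])
        totals[key]["dbu_cost"] += row["dbus"] * row["price_per_dbu"]
    for row in cur_rows:
        # Keep untagged AWS spend visible instead of silently dropping it.
        key = (row["usage_date"], row["cluster_id"] or "unattributed")
        totals[key]["ec2_cost"] += row["unblended_cost"]
    return {
        key: {**costs, "total_cost": costs["dbu_cost"] + costs["ec2_cost"]}
        for key, costs in totals.items()
    }


combined = combine_costs(dbu_usage, cur_line_items)
for key in sorted(combined):
    print(key, combined[key])
```

The "unattributed" bucket is a design choice in this sketch: EC2 line items that cannot be matched to a cluster still show up in the total, which makes tagging gaps easy to spot on a dashboard.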

A couple of practical details matter a lot for attribution. If you want to slice EC2 cost back to workspaces, clusters, teams, or cost centers, make sure you are using custom tags consistently, because those tags propagate to AWS EC2 and EBS resources and also show up in Databricks usage records. One important caveat is that when clusters are created from a pool, the underlying EC2 instances inherit workspace and pool tags, not cluster tags, so pool-level tagging becomes important for clean chargeback.
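The pool caveat is easy to see with a small sketch. The tag key `cost_center` and the tag dictionaries below are hypothetical examples, not values Databricks sets for you. The point is only that a cost-center lookup on the EC2 side can see nothing but the tags that actually reached the instance.

```python
def cost_center_for(resource_tags, tag_key="cost_center"):
    """Return the chargeback bucket for one EC2 line item.

    `resource_tags` is the tag map that reached the EC2 instance;
    `tag_key` is a hypothetical custom tag name used for chargeback.
    """
    value = resource_tags.get(tag_key)
    return value if value else "untagged"


# Cluster created directly: its custom tags reach the EC2 instance.
direct_instance_tags = {"cost_center": "analytics"}

# Cluster created from a pool: the instance carries workspace and pool tags,
# so a tag set only on the cluster never arrives.
pool_instance_untagged_pool = {"team": "platform"}

# Same pool once the custom tag is also set at pool level.
pool_instance_tagged_pool = {"team": "platform", "cost_center": "analytics"}

print(cost_center_for(direct_instance_tags))
print(cost_center_for(pool_instance_untagged_pool))
print(cost_center_for(pool_instance_tagged_pool))
```

In other words, if the tag you charge back on lives only on the cluster, pool-backed EC2 spend lands in the "untagged" bucket until the same tag is added to the pool.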

If the goal is specifically to extend the existing Databricks usage dashboards, the usual approach is to import the usage dashboard, copy or customise it, and point it at a modelled table or view that combines Databricks usage with AWS CUR data. Databricks also notes that the imported dashboards are customisable. If you use Usage Dashboard v2.0, that version is still marked as Preview in the docs as of now.

Hope this helps.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***