Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Does Databricks have an equivalent of Google Cloud BigQuery's --dry_run to estimate costs before executing?

zach
New Contributor III

Databricks uses DBUs as its costing unit whether it runs on top of AWS, Azure, or GCP, and I want to know if Databricks has an equivalent of Google Cloud BigQuery's --dry_run flag for estimating costs: https://cloud.google.com/bigquery/docs/estimate-costs

4 REPLIES

-werners-
Esteemed Contributor III

Not that I know of.

Google uses number of bytes read to determine the cost.

Databricks uses DBUs. The number of DBUs spent depends not only on the amount of bytes read (the more you read, the longer the program will probably run), but also on the type of VM used.

Then there is also autoscaling, which makes the price harder to predict.

Also, the total cost is not only DBUs but also the provisioning cost of the VMs.

So that makes it pretty hard to predict a cost.

It would of course be very cool to have such a prediction.
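To make those cost components concrete, here is a minimal back-of-the-envelope sketch. This is not a Databricks API; the function name and all rates are hypothetical placeholders you would replace with your actual cluster configuration and your cloud's pricing:

```python
# Rough Databricks cost model: total cost = DBU cost + VM provisioning cost.
# All rates below are hypothetical placeholders, not real prices.

def estimate_job_cost(duration_hours: float,
                      num_workers: int,
                      dbu_per_node_hour: float,
                      dbu_price: float,
                      vm_price_per_hour: float) -> float:
    """Estimate total job cost as DBU charges plus cloud VM charges."""
    node_hours = duration_hours * num_workers
    dbu_cost = node_hours * dbu_per_node_hour * dbu_price
    vm_cost = node_hours * vm_price_per_hour
    return dbu_cost + vm_cost

# Example: a 2-hour job on 4 workers at 1.5 DBU per node-hour,
# $0.30 per DBU and $0.50 per VM-hour (all made-up numbers).
cost = estimate_job_cost(2.0, 4, 1.5, 0.30, 0.50)
print(round(cost, 2))  # 7.6
```

With autoscaling, the worker count varies during the run, so in practice you would evaluate this with both the minimum and maximum worker counts to get a cost range rather than a single number.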


zach
New Contributor III

Hi @Werner Stinckens, thank you for taking the time to reply and for the thoughtful response. I find it hard to believe that so many companies use this type of compute when the price is so hard to know up front. I understand there is some ambiguity around bytes read and cluster type, but do you know of a way to get a rough estimate?

-werners-
Esteemed Contributor III

Databricks does give you a view of how many DBUs per hour a cluster consumes (a from-to interval in the case of autoscaling); see the cluster pane for this.

With that and the duration of the job, you can make an estimate. But for the duration you need to actually run the program (perhaps on a small sample of the data, and extrapolate).

This is a pretty rough estimate though. Maybe others have succeeded in doing this.
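The sample-and-extrapolate suggestion above can be sketched as follows. This is my own illustration, not a Databricks feature: it assumes runtime scales roughly linearly with data size, and all prices are made-up placeholders:

```python
# Extrapolate a job's cost from a timed run on a small sample, assuming
# roughly linear scaling of runtime with data size. Prices are placeholders.

def extrapolate_cost(sample_seconds: float,
                     sample_bytes: int,
                     full_bytes: int,
                     dbu_per_hour_min: float,
                     dbu_per_hour_max: float,
                     dbu_price: float) -> tuple:
    """Return a (low, high) DBU-cost estimate for the full dataset,
    using the cluster's autoscaling DBU/hour interval."""
    est_hours = (sample_seconds * full_bytes / sample_bytes) / 3600
    low = est_hours * dbu_per_hour_min * dbu_price
    high = est_hours * dbu_per_hour_max * dbu_price
    return (low, high)

# Example: 60 s on a 1 GB sample, 100 GB of full data, an autoscaling
# cluster consuming 4-16 DBU/hour, $0.30 per DBU (made-up numbers).
low, high = extrapolate_cost(60, 10**9, 10**11, 4, 16, 0.30)
print(round(low, 2), round(high, 2))  # 2.0 8.0
```

Note this only covers the DBU side; VM provisioning cost would be added on top, and shuffles or skew can make the linear-scaling assumption optimistic.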

zach
New Contributor III

Hi Kaniz, unfortunately there are no answers in this thread. It would be good to get a steer from someone at Databricks if possible.
