
How to choose a compute, and how to find alternatives for the current compute being used?

Ikanip
New Contributor II

We are using a compute type for an interactive cluster in production, which incurs a cost of X. We want to know what options are available with roughly the same processing power as the current compute but at a cost of Y, which is less than X.

4 REPLIES

raphaelblg
New Contributor III

Hello @Ikanip ,

You can utilize the Databricks Pricing Calculator to estimate costs.

For detailed information on compute capacity, please refer to your cloud provider's documentation regarding Virtual Machine instance types.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

Ikanip
New Contributor II

Hi Raphael,

Thanks for this. 

I will give you a specific example.

Let's say that on Azure I choose DS5 v2, which has 16 cores and 56 GB of RAM and costs $1,861.50/month on pay-as-you-go (PAYG). Then I choose D16s v3, which also has 16 cores but 64 GB of RAM and costs $1,489.20/month on PAYG, which is less than the former. They are probably using the same 3rd Generation Intel® Xeon® processors. So what is the difference between them, and why do they differ in cost?
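To make the comparison concrete, here is a minimal Python sketch that uses only the figures quoted in this post (the `VmQuote` class and its field names are made up for illustration). It normalizes the monthly price per core and per GB of RAM. It says nothing about actual performance, which still has to be tested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmQuote:
    """A price quote for one VM size (hypothetical helper, not an Azure API)."""
    name: str
    cores: int
    ram_gb: int
    monthly_usd: float

    @property
    def usd_per_core(self) -> float:
        return self.monthly_usd / self.cores

    @property
    def usd_per_gb_ram(self) -> float:
        return self.monthly_usd / self.ram_gb


# Figures taken from the post above (pay-as-you-go, per month).
ds5_v2 = VmQuote("DS5 v2", cores=16, ram_gb=56, monthly_usd=1861.50)
d16s_v3 = VmQuote("D16s v3", cores=16, ram_gb=64, monthly_usd=1489.20)

saving = ds5_v2.monthly_usd - d16s_v3.monthly_usd
saving_pct = 100 * saving / ds5_v2.monthly_usd

for vm in (ds5_v2, d16s_v3):
    print(f"{vm.name}: ${vm.usd_per_core:.2f}/core, ${vm.usd_per_gb_ram:.2f}/GB RAM")
print(f"Monthly saving: ${saving:.2f} ({saving_pct:.1f}%)")
```

On these numbers D16s v3 is about 20% cheaper per month, so the open question is whether it delivers comparable throughput for the workload.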


raphaelblg
New Contributor III

Hi @Ikanip, I suggest checking this with the cloud provider; unfortunately, I don't have the details. Databricks cost estimation relies to some extent on the cloud provider's cost estimation.

 

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

This is exactly as @raphaelblg described.
You have to dig into the Microsoft docs on VM sizes.

You can't just look at it as "it has less memory, so why is it more expensive?" There is more to it than that.

Your example compares the Dv2 series with the Dv3 series. In the docs you can find that Microsoft changed the memory-to-CPU ratio in Dv3 to make it more efficient, and that Dv3 runs in a hyper-threaded configuration, which means a Dv3 vCPU is a hyper-thread rather than a dedicated physical core. They also adjusted disk and network limits to align with the move to hyper-threading.
Hyper-threading = improved parallelization of computations.

With DS5 v2 you get much higher IOPS and network bandwidth.

If you are looking for memory-optimized machines, the advice is to move to the Ev3 and Esv3 series.

Please also be aware of Azure regions: machine "X" may be more expensive than machine "Y" in one region, but that might not hold in another region 🙂
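One way to check prices per region is the public Azure Retail Prices API. The sketch below only builds the query URL; the region and SKU names are example values, and the response field names mentioned in the comments (`Items`, `retailPrice`, `unitOfMeasure`) are what I would expect from that API, so verify them against the Microsoft docs before relying on this.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://prices.azure.com/api/retail/prices"


def price_query_url(region: str, sku: str) -> str:
    """Build an OData-filtered URL for one VM SKU in one region."""
    filter_expr = (
        "serviceName eq 'Virtual Machines' "
        f"and armRegionName eq '{region}' "
        f"and armSkuName eq '{sku}'"
    )
    return f"{BASE_URL}?$filter={quote(filter_expr)}"


def fetch_prices(region: str, sku: str) -> list:
    """Fetch matching price records (needs network access; not called here)."""
    with urlopen(price_query_url(region, sku)) as resp:
        payload = json.load(resp)
    # Each item is expected to carry fields such as retailPrice and unitOfMeasure.
    return payload.get("Items", [])


# Example values only: compare the same SKU across two regions.
for region in ("westeurope", "eastus"):
    print(price_query_url(region, "Standard_D16s_v3"))
```

Calling `fetch_prices` for the same SKU in several regions, and again for the SKU you run today, gives a like-for-like price table before you test anything.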

So if you swap your compute, you might see a drop in performance.

If you are looking for savings, you need to:
- test different VMs
- check spot instances
- run job clusters instead (maybe with a pool for faster start-up)
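As a rough illustration of the last two points, here is a sketch of a job cluster spec that requests Azure spot VMs with on-demand fallback. The `job_cluster_spec` helper is hypothetical, and the runtime version, node type, and worker count are placeholders; check the field names against the Databricks Clusters API docs for your workspace.

```python
import json


def job_cluster_spec(node_type_id: str, num_workers: int, spark_version: str) -> dict:
    """Build a `new_cluster`-style spec that uses spot VMs (illustrative helper)."""
    return {
        "spark_version": spark_version,  # placeholder runtime version
        "node_type_id": node_type_id,    # the VM size under test
        "num_workers": num_workers,
        "azure_attributes": {
            # Keep the first node (the driver) on-demand so an eviction
            # does not take down the whole cluster.
            "first_on_demand": 1,
            # Use spot capacity, falling back to on-demand if none is available.
            "availability": "SPOT_WITH_FALLBACK_AZURE",
            # -1 is meant to cap the bid at the on-demand price.
            "spot_bid_max_price": -1,
        },
    }


spec = job_cluster_spec("Standard_D16s_v3", num_workers=4, spark_version="13.3.x-scala2.12")
print(json.dumps(spec, indent=2))
```

Running the same job with a few different `node_type_id` values and comparing runtime against cost is the "test different VMs" step. If you use a pool instead, the spec would reference the pool rather than a node type.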

I hope I was able to clarify a few things for you 🙂