Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Equivalent Machine Types between Databricks on Azure and GCP

ranged_coop
Valued Contributor II

Hi All,

Hope everyone is doing well.

We are currently validating Databricks on GCP and Azure.

We have a Python notebook that does some ETL: it copies zip files, extracts them, and processes the files inside.

Our cluster config on Azure:

DBX Runtime 10.4, Driver: Standard DS4_v2, Workers: Standard D8_v3 (4 workers). Total: 40 cores, 156 GB.

We tried a similar config on GCP:

DBX Runtime 11.3, Driver: n2-highmem-4, Workers: n2-standard-8 (4 workers). Total: 36 cores, 160 GB.
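As a quick sanity check on the totals quoted above, here is a small sketch that adds up driver and worker resources for both clusters. The per-VM vCPU and memory figures are assumptions taken from the published Azure and GCP VM size tables, not from this thread, and `MachineType` and `cluster_totals` are hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineType:
    name: str
    vcpus: int
    memory_gb: int


# Per-VM sizes (assumed from the cloud providers' published size tables).
DS4_V2 = MachineType("Standard_DS4_v2", 8, 28)
D8_V3 = MachineType("Standard_D8_v3", 8, 32)
N2_HIGHMEM_4 = MachineType("n2-highmem-4", 4, 32)
N2_STANDARD_8 = MachineType("n2-standard-8", 8, 32)


def cluster_totals(driver, worker, num_workers):
    """Return (total cores, total memory in GB) for one driver plus N workers."""
    cores = driver.vcpus + worker.vcpus * num_workers
    memory = driver.memory_gb + worker.memory_gb * num_workers
    return cores, memory


azure = cluster_totals(DS4_V2, D8_V3, 4)
gcp = cluster_totals(N2_HIGHMEM_4, N2_STANDARD_8, 4)
print("Azure (cores, GB):", azure)
print("GCP   (cores, GB):", gcp)
```

Under these assumed sizes the workers are the same on both clouds (8 vCPU, 32 GB each), and the whole difference in the totals comes from the driver: 8 vCPU on Azure versus 4 vCPU on GCP. If the copy and unzip steps run as plain Python on the driver, that could matter more than the totals suggest.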

For the same notebook with minor path changes, the execution time is much higher on GCP than on Azure: it increased from 1h to 3h.

The notebook has not changed much (some large functions were split into smaller ones, plus the path changes), so I was wondering whether the slowdown might be due to the runtime version change and the machine types.
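One way to narrow this down is to time each ETL stage separately on both clouds and compare the numbers stage by stage. The following is a minimal sketch using only the Python standard library; `timed`, `run_etl`, the stage names, and the tiny sample zip are all hypothetical stand-ins for the real notebook code, and the "process" step here only sums file sizes.

```python
import shutil
import tempfile
import time
import zipfile
from contextlib import contextmanager
from pathlib import Path

timings = {}  # stage name -> accumulated seconds


@contextmanager
def timed(stage):
    """Accumulate wall-clock time spent inside the with-block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + time.perf_counter() - start


def run_etl(src_zip: Path, work_dir: Path) -> int:
    """Copy a zip, extract it, and 'process' the contents (placeholder logic)."""
    with timed("copy"):
        local_zip = Path(shutil.copy(src_zip, work_dir / src_zip.name))
    with timed("extract"):
        out_dir = work_dir / "extracted"
        with zipfile.ZipFile(local_zip) as zf:
            zf.extractall(out_dir)
    with timed("process"):
        # Placeholder for the real processing: sum the extracted file sizes.
        total_bytes = sum(p.stat().st_size for p in out_dir.rglob("*") if p.is_file())
    return total_bytes


# Self-contained demo on a throwaway zip file.
with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp) / "src"
    work_dir = Path(tmp) / "work"
    src_dir.mkdir()
    work_dir.mkdir()
    src_zip = src_dir / "sample.zip"
    with zipfile.ZipFile(src_zip, "w") as zf:
        zf.writestr("a.csv", "id,value\n1,10\n2,20\n")
    processed_bytes = run_etl(src_zip, work_dir)

for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.4f}s")
```

If one stage accounts for most of the extra two hours (for example, copying from cloud storage), that points at storage or network rather than the runtime version or the worker machine type.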

  1. Are there any documented slowdowns in DBX Runtime 11.3 compared to 10.4?
  2. Is there a table mapping equivalent machine types between Azure and GCP? A Google search shows similar groupings (e.g. Compute Optimized in Azure vs. Compute Optimized in GCP), but no one-to-one mapping.
  3. Can splitting a single function into multiple functions cause such a huge difference in execution time?
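For the third question, a rough way to check is to measure the cost of the split in isolation. The sketch below compares one monolithic function against the same logic split into small helpers, using `timeit` from the standard library. All function names and the sample data are hypothetical, and the result only reflects plain-Python call overhead on whatever machine runs it, not anything specific to Databricks or Spark.

```python
import timeit


def process_single(values):
    """All steps inline in one loop."""
    total = 0
    for v in values:
        cleaned = v.strip()
        number = int(cleaned)
        total += number * 2
    return total


def _clean(v):
    return v.strip()


def _parse(v):
    return int(v)


def _scale(n):
    return n * 2


def process_split(values):
    """Same logic, but each step is a separate helper call."""
    total = 0
    for v in values:
        total += _scale(_parse(_clean(v)))
    return total


values = [f" {i} " for i in range(10_000)]

# Both versions must produce the same result before timing them.
assert process_single(values) == process_split(values)

single_s = timeit.timeit(lambda: process_single(values), number=20)
split_s = timeit.timeit(lambda: process_split(values), number=20)
print(f"single function: {single_s:.4f}s")
print(f"split functions: {split_s:.4f}s")
```

Each extra Python call adds some overhead, so the split version may come out somewhat slower in a tight loop like this one. Comparing the two timings gives a sense of whether a refactor of that kind could plausibly explain the gap seen in the notebook, or whether the cause lies elsewhere.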

Thanks for all the help.

Cheers...

2 REPLIES

tunstila
Contributor II

Hi @ranged_coop,

You might want to refer to the link below:

https://docs.databricks.com/release-notes/runtime/releases.html

ranged_coop
Valued Contributor II

Hi @Tunde Abib, I have gone through the links while updating, but did not see any major documented slowdowns mentioned in them.
