06-23-2021 11:41 PM - last edited on 03-21-2025 05:42 AM by Advika
How is cluster auto-scaling in Databricks different from dynamic allocation in YARN?
06-24-2021 01:38 AM
Dynamic allocation is a Spark feature used with YARN. It scales based on executor idleness and is not shuffle aware: removing an idle executor would otherwise lose the shuffle files stored on it, so an external shuffle service is mandatory when dynamic allocation is enabled.
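As a concrete illustration of the YARN side, this is a minimal sketch of the Spark properties typically involved; the executor counts and idle timeout are example values, not recommendations:

```
# spark-defaults.conf — illustrative dynamic allocation setup on YARN
spark.dynamicAllocation.enabled              true
# External shuffle service must be on, so shuffle files survive executor removal
spark.shuffle.service.enabled                true
spark.dynamicAllocation.minExecutors         2
spark.dynamicAllocation.maxExecutors         20
# Executor is released after this long with no tasks (default 60s)
spark.dynamicAllocation.executorIdleTimeout  60s
```

Note that the shuffle service also has to be running on each YARN NodeManager for this configuration to work.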
Databricks auto-scaling is shuffle aware and does not need an external shuffle service, because the scaling logic accounts for shuffle state before removing workers. Its scale-up and scale-down algorithm is more efficient, and Databricks also exposes configuration that lets the user control the aggressiveness of scaling, which YARN's dynamic allocation does not offer.
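On the Databricks side, autoscaling is configured on the cluster itself rather than through Spark properties. A minimal sketch of a cluster spec for the Clusters API is below; the cluster name and worker bounds are example values, and `spark_version` / `node_type_id` are placeholders you would fill in for your workspace:

```json
{
  "cluster_name": "autoscaling-example",
  "spark_version": "<runtime-version>",
  "node_type_id": "<node-type>",
  "autoscale": {
    "min_workers": 2,
    "max_workers": 8
  }
}
```

With `autoscale` set instead of a fixed `num_workers`, Databricks resizes the cluster between the two bounds based on load, with no external shuffle service required.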
03-17-2023 12:56 AM
What does "shuffle aware" mean in this context, and why does Databricks not need an external shuffle service to scale up and down?