06-23-2021 11:41 PM - last edited on 03-21-2025 05:42 AM by Advika
How is cluster auto-scaling in Databricks different from dynamic allocation in YARN?
06-24-2021 01:38 AM
Dynamic allocation is a Spark feature used with YARN. It scales based on executor idleness and is not shuffle aware: removing an idle executor would otherwise lose the shuffle files stored on it, so an external shuffle service is mandatory when dynamic allocation is enabled.
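As a concrete illustration of the YARN side, this is a minimal sketch of the Spark properties typically involved; the executor counts and idle timeout are example values, not recommendations:

```
# spark-defaults.conf — illustrative dynamic allocation setup on YARN
spark.dynamicAllocation.enabled              true
# External shuffle service must be on, so shuffle files survive executor removal
spark.shuffle.service.enabled                true
spark.dynamicAllocation.minExecutors         2
spark.dynamicAllocation.maxExecutors         20
# Executor is released after this long with no tasks (default 60s)
spark.dynamicAllocation.executorIdleTimeout  60s
```

Note that the shuffle service also has to be running on each YARN NodeManager for this configuration to work.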
Databricks auto-scaling is shuffle aware and does not need an external shuffle service, because the scaling logic accounts for shuffle state before removing workers. Its scale-up and scale-down algorithm is more efficient, and Databricks also exposes configuration that lets the user control the aggressiveness of scaling, which YARN's dynamic allocation does not offer.
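On the Databricks side, autoscaling is configured on the cluster itself rather than through Spark properties. A minimal sketch of a cluster spec for the Clusters API is below; the cluster name and worker bounds are example values, and `spark_version` / `node_type_id` are placeholders you would fill in for your workspace:

```json
{
  "cluster_name": "autoscaling-example",
  "spark_version": "<runtime-version>",
  "node_type_id": "<node-type>",
  "autoscale": {
    "min_workers": 2,
    "max_workers": 8
  }
}
```

With `autoscale` set instead of a fixed `num_workers`, Databricks resizes the cluster between the two bounds based on load, with no external shuffle service required.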
03-17-2023 12:56 AM
What does "shuffle aware" mean in this context, and why does Databricks not need an external shuffle service to scale up and down?