06-23-2021 11:41 PM - last edited 2 weeks ago by Advika
How is cluster auto-scaling in Databricks different from dynamic allocation in YARN?
Labels: Cluster
Accepted Solutions
06-24-2021 01:38 AM
Dynamic allocation is a Spark feature exclusive to YARN. It scales based on executor idleness and is not shuffle aware; for that reason, an external shuffle service is mandatory, so that shuffle files remain available after the idle executors that wrote them are removed.
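To illustrate what this involves on the YARN side, here is a minimal sketch of enabling dynamic allocation in PySpark. The configuration keys are standard Spark settings; the executor counts and timeout values are illustrative assumptions, not recommendations:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("dynamic-allocation-demo")
    .master("yarn")
    # Let Spark add and remove executors based on load and idleness
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.minExecutors", "1")
    .config("spark.dynamicAllocation.maxExecutors", "10")
    # An executor idle for this long becomes a candidate for removal
    .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
    # Mandatory companion setting: shuffle files must outlive the
    # executors that wrote them, hence the external shuffle service
    .config("spark.shuffle.service.enabled", "true")
    .getOrCreate()
)
```

Note that setting `spark.shuffle.service.enabled` alone is not enough: the external shuffle service must also be running as an auxiliary service in each YARN NodeManager.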
Databricks auto-scaling is shuffle aware and does not need an external shuffle service. The algorithm used for scale-up and scale-down is much more efficient. Databricks auto-scaling also exposes configurations that let the user control how aggressively the cluster scales, which is not available in YARN.
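By contrast, on Databricks you only declare worker bounds and the platform handles the scaling decisions. A minimal sketch using the Clusters API (`POST /api/2.0/clusters/create`) follows; the workspace URL, token, node type, and runtime version are placeholders, not values from this thread:

```python
import requests

payload = {
    "cluster_name": "autoscaling-demo",
    "spark_version": "10.4.x-scala2.12",   # placeholder runtime version
    "node_type_id": "i3.xlarge",           # placeholder node type
    # Databricks scales workers between these bounds based on load;
    # no external shuffle service needs to be configured.
    "autoscale": {"min_workers": 2, "max_workers": 8},
}

resp = requests.post(
    "https://<workspace-url>/api/2.0/clusters/create",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json=payload,
)
print(resp.json())
```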
03-17-2023 12:56 AM
What does 'shuffle aware' mean in this context, and why does Databricks not need an external shuffle service to scale up and down?