enable dynamic resource allocation on job cluster
Wednesday - last edited Wednesday
I have a Databricks job with two tasks that can run either alone or in parallel (controlled by an If/else condition task). When they run in parallel, one task runs for a long time, but the same task finishes quickly when it runs alone. Further analysis in the Spark UI showed that this specific task is assigned to a FIFO scheduler pool, while the other one uses FAIR.
What could cause that one task to run so long? Can Dynamic Resource Allocation be used on a job cluster for this scenario, or does it have something to do with the scheduler pool?
yesterday
Hello @Abser786,
There is a difference between Dynamic Resource Allocation and the scheduler policy.
Dynamic Resource Allocation means acquiring more compute as needed once the current compute is fully consumed. On a job cluster this is achieved with the autoscaling feature in the cluster configuration (details here).
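For illustration, here is a minimal sketch of what an autoscaling job-cluster spec can look like. It is a `new_cluster` fragment for the Jobs API written as a Python dict; the runtime version and node type are placeholder values, not recommendations:
```python
# Hypothetical Jobs API "new_cluster" fragment, written as a Python dict.
# The "autoscale" block is what enables Databricks autoscaling.
new_cluster = {
    "spark_version": "15.4.x-scala2.12",  # placeholder LTS runtime
    "node_type_id": "i3.xlarge",          # placeholder node type
    "autoscale": {
        "min_workers": 2,  # cluster starts with 2 workers
        "max_workers": 8,  # and can grow to 8 under load
    },
}
```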
The scheduler policy, on the other hand, which appears to differ between the two tasks in your case, can be controlled and aligned. If autoscaling is not an option here, the best approach is to use the FAIR scheduler for both tasks: set spark.scheduler.mode to FAIR in the job or cluster configuration so the tasks use FAIR scheduling rather than the default FIFO. A sketch follows below.
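As a minimal sketch: set the Spark config `spark.scheduler.mode FAIR` on the job cluster (it must be set at cluster start), and optionally route each task's jobs to its own pool from the task's code. The pool name `task_a` below is just an example:
```python
# Runs inside each Databricks task. Assumes the job cluster was started with
# the Spark config "spark.scheduler.mode FAIR".
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` is predefined

# Optional: route this task's Spark jobs to a named FAIR pool so parallel
# tasks share executors instead of queuing behind each other. Use a distinct
# pool name per task (e.g. "task_a" and "task_b").
spark.sparkContext.setLocalProperty("spark.scheduler.pool", "task_a")
```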
Regards

