SSundaram
Databricks Partner

Try increasing your max capacity limit. You might also want to lower the minimum number of nodes the job uses.

At the job level, try configuring retries and the time interval between retries.
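As a rough sketch of what those settings could look like, here is a fragment of a job definition built as a Python dict. It assumes the job runs on an autoscaling job cluster and that the retry options map to the task-level `max_retries`, `min_retry_interval_millis`, and `retry_on_timeout` fields of the Jobs API. The job name, task key, and all numbers are made-up placeholders, and required cluster fields such as the Spark version and node type are left out, so this is not a complete payload.

```python
import json

# Hypothetical values throughout; tune them for your own workload.
job_settings = {
    "name": "example-job",  # placeholder job name
    "tasks": [
        {
            "task_key": "main",  # placeholder task key
            # Retry a failed run up to 3 times.
            "max_retries": 3,
            # Wait at least 5 minutes between attempts.
            "min_retry_interval_millis": 5 * 60 * 1000,
            # Also retry when the run times out.
            "retry_on_timeout": True,
            "new_cluster": {
                # A low minimum lets the job start with fewer nodes;
                # a higher maximum leaves room to scale out.
                "autoscale": {"min_workers": 1, "max_workers": 8},
            },
        }
    ],
}

print(json.dumps(job_settings, indent=2))
```

The same fields can be set from the job's configuration in the UI. If the "max capacity limit" in your case is on an instance pool, that is changed on the pool itself rather than in the job definition.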
