Is it recommended to turn on Spark speculative execution permanently?

brickster_2018
Databricks Employee

I had a job where the last step would get stuck forever. Turning on Spark speculative execution worked like magic and resolved the issue.

Is it safe to turn on Spark speculative execution permanently?

brickster_2018
Databricks Employee

It's not recommended to turn on Spark speculative execution permanently. For jobs where tasks run slowly or get stuck because of transient network or storage issues, speculative execution can be very handy. However, it masks the actual problem: Spark simply launches another attempt of the slow task instead of addressing why the task was slow in the first place.

Speculative execution should be treated as a temporary workaround until you find the root cause of why the task or job is stuck.
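One way to keep the workaround temporary is to scope it to the affected job instead of putting it in a shared cluster configuration. Here is a minimal sketch of that idea. The helper names `speculation_conf` and `to_submit_args` are made up for illustration; the `spark.speculation*` keys are standard Spark settings, and the default values shown are the ones I recall from the Spark 3.x docs, so please check them against your version. As far as I know the scheduler reads `spark.speculation` when the application starts, so it needs to be supplied at launch (for example in the job cluster's Spark config or via `spark-submit --conf`) rather than changed mid-job.

```python
def speculation_conf(multiplier=1.5, quantile=0.75, interval="100ms"):
    """Build Spark settings that enable speculation for a single job.

    Hypothetical helper: only the config keys come from Spark itself.
    Defaults are assumed Spark 3.x defaults; verify for your version.
    """
    if multiplier <= 1:
        raise ValueError("multiplier must be greater than 1")
    if not 0 < quantile <= 1:
        raise ValueError("quantile must be in (0, 1]")
    return {
        "spark.speculation": "true",
        # How often Spark checks for tasks to speculate.
        "spark.speculation.interval": interval,
        # How many times slower than the median a task must be.
        "spark.speculation.multiplier": str(multiplier),
        # Fraction of tasks that must finish before speculation starts.
        "spark.speculation.quantile": str(quantile),
    }


def to_submit_args(conf):
    """Render the settings as `--conf key=value` pairs for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Pass these arguments only to the job that needs the workaround.
    print(" ".join(to_submit_args(speculation_conf())))
```

Once the root cause is fixed, dropping these arguments from that one job turns speculation off again without touching anything else.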

Speculative execution can also trigger unnecessary duplicate task attempts and degrade the performance of jobs or stages where no task is actually stuck.
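To see why healthy jobs can pay for this, it helps to look at how a task gets picked for speculation. The sketch below is a simplified model of the heuristic as I understand it, not Spark's actual scheduler code: once a `quantile` fraction of a stage's tasks has finished, any running task whose elapsed time exceeds `multiplier` times the median duration of the finished tasks becomes a candidate. The function name, arguments, and sample numbers are hypothetical.

```python
from statistics import median


def speculatable_tasks(finished, running, total_tasks,
                       multiplier=1.5, quantile=0.75, min_time=0.1):
    """Return ids of running tasks that a simplified speculation check would flag.

    finished:    durations (seconds) of tasks that completed successfully
    running:     dict of task id -> elapsed seconds so far
    total_tasks: number of tasks in the stage
    """
    # Nothing is speculated until enough of the stage has finished.
    if total_tasks == 0 or len(finished) < quantile * total_tasks:
        return []
    # A task is "slow" if it runs longer than multiplier x median duration.
    threshold = max(multiplier * median(finished), min_time)
    return sorted(task for task, elapsed in running.items()
                  if elapsed > threshold)


if __name__ == "__main__":
    done = [10, 11, 9, 10, 12, 10, 11, 10]  # 8 of 10 tasks finished
    # A genuinely stuck task is flagged...
    print(speculatable_tasks(done, {"t8": 12, "t9": 300}, total_tasks=10))
    # ...but so is a task that is only slow because its partition is larger.
    print(speculatable_tasks(done, {"t8": 16}, total_tasks=10))
```

The second call is the case described above: the task is not stuck, it just has more data, yet it crosses the threshold and gets a duplicate attempt that consumes executor slots without helping the job finish sooner.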
