When you have a job in Workflows with multiple tasks running one after another, there seems to be a consistent 7-second delay between the tasks. More precisely, every task has roughly 7 seconds of overhead before its code actually runs. Does anybody know why, or whether there is a workaround?
We've not tested this in every possible setup, but here's what we did:
We created a notebook with a single print statement [print("Hello world")]. This takes milliseconds to execute in the notebook itself. We then created a job with 3 or more tasks, each running the same notebook. We ran the job using both a job cluster and an all-purpose cluster, with a driver + 2 workers with 4 cores. When you run the job, each task takes about 7 seconds to complete.
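For anyone who wants to reproduce this, here is a rough sketch of how the job described above could be written as a multi-task job definition. It only builds and prints the payload and does not call any API. The field names (task_key, depends_on, notebook_task, existing_cluster_id) follow the Jobs API 2.1 format as we understand it. The helper name build_job_payload, the job name, the notebook path, and the cluster ID are placeholders, not values from our setup.

```python
import json


def build_job_payload(notebook_path, cluster_id, num_tasks=3):
    """Build a job definition whose tasks run the same notebook one after another.

    Hypothetical helper for illustration; it does not contact Databricks.
    """
    tasks = []
    for i in range(num_tasks):
        task = {
            "task_key": f"task_{i + 1}",
            "notebook_task": {"notebook_path": notebook_path},
            "existing_cluster_id": cluster_id,
        }
        # Every task except the first waits for the previous one to finish.
        if i > 0:
            task["depends_on"] = [{"task_key": f"task_{i}"}]
        tasks.append(task)
    return {"name": "task-overhead-repro", "tasks": tasks}


# Placeholder notebook path and cluster ID; replace with real values.
payload = build_job_payload("/Shared/hello_world", "<cluster-id>")
print(json.dumps(payload, indent=2))
```

This variant assumes an all-purpose cluster referenced by ID. For the job cluster run, the cluster reference would need to point at a job cluster definition instead.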
This delay might be negligible on larger jobs, but we have some smaller jobs that need to run often. If we use workflow tasks, these delays will in some cases double the run time, which is unacceptable.
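To make the impact concrete, here is a small back-of-the-envelope calculation. The per-task work durations are made-up illustrative numbers, not measurements, and the 7-second figure is the approximate overhead we observed. The function name run_time_with_overhead is just for this example.

```python
def run_time_with_overhead(work_seconds, overhead_seconds=7.0):
    """Return (total work, total run time) for tasks that run sequentially.

    Assumes a fixed overhead is added once per task.
    """
    total_work = sum(work_seconds)
    total_overhead = overhead_seconds * len(work_seconds)
    return total_work, total_work + total_overhead


# Hypothetical job: three small tasks doing 5, 10 and 6 seconds of real work.
work, total = run_time_with_overhead([5.0, 10.0, 6.0])
print(f"work: {work:.0f} s, with overhead: {total:.0f} s, ratio: {total / work:.1f}x")
# prints "work: 21 s, with overhead: 42 s, ratio: 2.0x"
```

With tasks this small, the per-task overhead adds up to as much time as the work itself, which is the doubling mentioned above.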