Hello,
Following the suggestion in this thread, for job clusters we attach the libraries only to the first task of the workflow; once installed, they are available to the subsequent tasks.
However, this approach is not reliable for run repairs: the state of the cluster is not recovered, and because a repair executes only the failed task and the tasks that follow it, the first task (the one carrying the dependencies) is not re-executed. As a result, the libraries are never installed on the cluster used for the repair.
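To make the setup concrete, here is a minimal sketch of what I mean. It assumes a job definition shaped like a Jobs API 2.1 payload (`job_clusters`, `tasks`, `task_key`, `job_cluster_key`, `depends_on`, `libraries`); the job name, task keys, notebook paths, package, and cluster values are all hypothetical placeholders, and `rerun_has_libraries` is just an illustrative helper, not part of any Databricks SDK.

```python
# Hypothetical job definition: libraries are attached to the first task only.
# All names, paths, and cluster values below are placeholders.
PYPI_LIBS = [{"pypi": {"package": "example-package==1.0.0"}}]

job_settings = {
    "name": "example-workflow",
    "job_clusters": [
        {
            "job_cluster_key": "shared_cluster",
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",  # placeholder
                "node_type_id": "i3.xlarge",          # placeholder
                "num_workers": 2,
            },
        }
    ],
    "tasks": [
        {
            "task_key": "extract",
            "job_cluster_key": "shared_cluster",
            "notebook_task": {"notebook_path": "/Workflows/extract"},
            "libraries": PYPI_LIBS,  # only the first task carries them
        },
        {
            "task_key": "transform",
            "job_cluster_key": "shared_cluster",
            "depends_on": [{"task_key": "extract"}],
            "notebook_task": {"notebook_path": "/Workflows/transform"},
        },
        {
            "task_key": "load",
            "job_cluster_key": "shared_cluster",
            "depends_on": [{"task_key": "transform"}],
            "notebook_task": {"notebook_path": "/Workflows/load"},
        },
    ],
}


def rerun_has_libraries(settings, rerun_keys):
    """Return True if any task selected for execution declares libraries."""
    selected = set(rerun_keys)
    return any(
        task.get("libraries")
        for task in settings["tasks"]
        if task["task_key"] in selected
    )


# A full run includes the first task, so the libraries get installed.
full_run = [task["task_key"] for task in job_settings["tasks"]]
print(rerun_has_libraries(job_settings, full_run))

# A repair after "transform" fails re-runs only "transform" and "load".
print(rerun_has_libraries(job_settings, ["transform", "load"]))
```

In the repair case none of the re-executed tasks declares any libraries, which matches the behaviour we observe.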
Unfortunately, attaching the dependencies to each and every task of the workflow is not an option, since the libraries seem to be reinstalled for every task, which increases the workflow execution time in proportion to the number of tasks.
Is there a common solution to this problem?