
Installing libraries on job clusters using task dependencies is not reliable in case of run repairs

aliz
Visitor

Hello,

Following the suggestion on this thread, for job clusters we install the libraries only on the first task of the workflow, which are then made available to the subsequent tasks.
However, this method is not reliable in the case of run repairs. The state of the cluster is not recovered, and only the tasks following the failure are executed. The task containing the dependencies is therefore not re-executed, so the libraries are never installed on the cluster used for the repair.
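
To make the setup concrete, here is a minimal sketch of what we have. The job definition is only shaped like a trimmed Jobs API payload, and the task keys, cluster key, package name, and the helper `tasks_installing_libraries` are all hypothetical, made up for illustration. It does not call Databricks; it only shows which of the executed tasks declare libraries.

```python
# Hypothetical job definition, shaped like a trimmed Jobs API payload.
# All keys and names below are illustrative assumptions, not our real job.
job = {
    "job_clusters": [{"job_cluster_key": "shared_cluster"}],  # cluster spec omitted
    "tasks": [
        {
            "task_key": "ingest",
            "job_cluster_key": "shared_cluster",
            # Libraries are declared only on the first task.
            "libraries": [{"pypi": {"package": "example-package"}}],
        },
        {
            "task_key": "transform",
            "job_cluster_key": "shared_cluster",
            "depends_on": [{"task_key": "ingest"}],
        },
        {
            "task_key": "publish",
            "job_cluster_key": "shared_cluster",
            "depends_on": [{"task_key": "transform"}],
        },
    ],
}


def tasks_installing_libraries(job, executed_task_keys):
    """Return the executed tasks that declare at least one library."""
    executed = set(executed_task_keys)
    return [
        task["task_key"]
        for task in job["tasks"]
        if task["task_key"] in executed and task.get("libraries")
    ]


# A full run executes every task, so the first task installs the libraries.
full_run = [task["task_key"] for task in job["tasks"]]
print(tasks_installing_libraries(job, full_run))

# A repair after a failure in "transform" skips "ingest",
# so no executed task declares the libraries.
repair_run = ["transform", "publish"]
print(tasks_installing_libraries(job, repair_run))
```

In the repair case the list is empty, which is the situation described above: nothing that runs during the repair asks for the libraries.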

Unfortunately, attaching the dependencies to each and every task of the workflow is not an option, since the libraries seem to be reinstalled every time, leading to an increase in workflow execution time proportional to the number of tasks.

Is there a common solution to this problem?
