07-10-2024 10:03 AM
I have workflows with multiple tasks, each of which needs 5 different libraries to run. When I have to update those libraries, I have to go in and make the update in each and every task. So for one workflow I have 20 different places where I have to go through and update the libraries.
I need to be able to designate a list of libraries to be available on the job cluster for all the tasks that use it, so that I only have to update the libraries in one place.
But from what I can tell, an entirely new cluster definition gets created for job compute every time the workflow runs, so I don't have a single cluster to configure. What am I missing?
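For context, this is roughly what every task in the job JSON looks like today — the same libraries list copy-pasted into each task (paths are placeholders, not my real wheels):

```json
{
  "tasks": [
    {
      "task_key": "task_1",
      "libraries": [
        { "whl": "/Workspace/Repos/me/libs/lib_a-1.0-py3-none-any.whl" },
        { "whl": "/Workspace/Repos/me/libs/lib_b-1.0-py3-none-any.whl" }
      ]
    },
    {
      "task_key": "task_2",
      "libraries": [
        { "whl": "/Workspace/Repos/me/libs/lib_a-1.0-py3-none-any.whl" },
        { "whl": "/Workspace/Repos/me/libs/lib_b-1.0-py3-none-any.whl" }
      ]
    }
  ]
}
```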
Accepted Solutions
07-11-2024 08:22 AM
Actually I think I found most of a solution here in one of the replies: https://community.databricks.com/t5/administration-architecture/installing-libraries-on-job-clusters...
It seems like I only have to define libs for the first task, and as long as all other tasks use the same job compute, I'm good to go. I'm assuming tasks within a workflow share compute by default?
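In case it helps anyone else, here's a rough sketch of what the job JSON looks like with one shared job cluster — the libraries attach to the first task only, and every other task points at the same `job_cluster_key` (cluster settings and paths are placeholders, not from a real job):

```json
{
  "name": "my_workflow",
  "job_clusters": [
    {
      "job_cluster_key": "shared_job_cluster",
      "new_cluster": {
        "spark_version": "14.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 2
      }
    }
  ],
  "tasks": [
    {
      "task_key": "task_1",
      "job_cluster_key": "shared_job_cluster",
      "notebook_task": { "notebook_path": "/Workspace/path/to/notebook_1" },
      "libraries": [
        { "whl": "/Workspace/Repos/me/libs/lib_a-1.0-py3-none-any.whl" },
        { "whl": "/Workspace/Repos/me/libs/lib_b-1.0-py3-none-any.whl" }
      ]
    },
    {
      "task_key": "task_2",
      "depends_on": [ { "task_key": "task_1" } ],
      "job_cluster_key": "shared_job_cluster",
      "notebook_task": { "notebook_path": "/Workspace/path/to/notebook_2" }
    }
  ]
}
```

On the "share compute by default" question: as far as I can tell, tasks don't share compute automatically — each one has to reference the same `job_cluster_key`. The UI just pre-selects the existing job cluster for new tasks, which is why it can look automatic.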
07-11-2024 05:34 AM
The libs I need to install are all private and not on PyPI. They are .whl files in repo folders. Can that be done with a requirements.txt file?
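Something like this is what I'm hoping would work — one requirements.txt in the repo that lists the wheels by workspace path, so version bumps happen in a single file (paths below are made up):

```
# requirements.txt — private wheels referenced by absolute workspace path
/Workspace/Repos/my_user/my_repo/libs/lib_a-1.0.0-py3-none-any.whl
/Workspace/Repos/my_user/my_repo/libs/lib_b-2.3.1-py3-none-any.whl
```

Then the job would only need that one file attached as a library on the shared cluster, assuming job clusters accept a requirements file that way.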
07-11-2024 08:22 AM
Actually I think I found most of a solution here in one of the replies: https://community.databricks.com/t5/administration-architecture/installing-libraries-on-job-clusters...
It seems like I only have to define libs for the first task, and as long as all other tasks use the same job compute, I'm good to go. I'm assuming tasks within a workflow share compute by default?

