Use init script for Databricks job cluster via Azure Data Factory
02-12-2025 01:45 PM
Hello,
I would like to install some libraries (both public and private) on a job cluster. I am using Azure Data Factory to run my Databricks notebooks and hence would like to use job clusters to run these jobs.
I have passed my init script to the job cluster, but the package installs sometimes work and sometimes do not, with no real pattern. The workspace paths to my packages do exist and are set up correctly.
What could be wrong? Is there anything I should check? Is there a more robust way to do this so that it always works? At the moment my libraries are sometimes installed on the job cluster and sometimes not.
I have attached the configuration I am using in Azure Data Factory to use my init script, and also a screenshot of what my init script looks like.
ADF config for init script
init script
Thank you very much in advance for the help,
Sacha
02-12-2025 06:46 PM
Hi @sachamourier,
What error do you get when the init script fails?
02-12-2025 09:34 PM - edited 02-12-2025 09:56 PM
Hi @Alberto_Umana, I don't get any failure. When my notebook runs on the newly created job cluster, my package imports fail because the packages have not been installed on the cluster.
As you can see in the attached images, the cluster does seem to find and run my init script. Is there another way to do this?
init script finished JSON
imports issue
job cluster event log
02-20-2025 12:28 AM
Hi @Alberto_Umana, do you have a solution for this issue?
Thanks a lot for your help,
Sacha
02-20-2025 05:01 AM
Hi @sachamourier,
Have you considered using cluster libraries? The behavior you are observing requires additional debugging, since the init script appears to run successfully. Can you enable cluster logging and look through the logs: https://docs.databricks.com/aws/en/compute/configure#compute-log-delivery
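To illustrate the cluster libraries suggestion: in Azure Data Factory, the Databricks Notebook activity accepts a `libraries` list in its `typeProperties`, so packages can be declared on the activity instead of being installed by an init script. The following is only a minimal sketch of that JSON built in Python. The activity name, notebook path, package name and wheel path are hypothetical placeholders, and the `linkedServiceName` reference that a real activity needs is left out.

```python
import json


def notebook_activity(name, notebook_path, pypi_packages, wheel_paths):
    """Build the dict for an ADF DatabricksNotebook activity with cluster libraries.

    Public packages go in as PyPI entries, private ones as wheel file paths.
    """
    libraries = [{"pypi": {"package": pkg}} for pkg in pypi_packages]
    libraries += [{"whl": path} for path in wheel_paths]
    return {
        "name": name,
        "type": "DatabricksNotebook",
        # A real activity also needs a "linkedServiceName" reference (omitted here).
        "typeProperties": {
            "notebookPath": notebook_path,
            "libraries": libraries,
        },
    }


# All values below are hypothetical placeholders, not taken from the thread.
activity = notebook_activity(
    name="RunMyNotebook",
    notebook_path="/Shared/my_notebook",
    pypi_packages=["simplejson"],
    wheel_paths=["dbfs:/FileStore/wheels/my_private_pkg-0.1.0-py3-none-any.whl"],
)
print(json.dumps(activity, indent=2))
```

The printed JSON can be compared with the activity definition shown in the ADF code view; the wheel location has to be a path the job cluster is allowed to read.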
Also, as a test, can you run the init script from a notebook to make sure it works?
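One possible way to run that test from a notebook cell is to execute the script with `bash` through Python's `subprocess` module and look at the exit code and output. This is a rough sketch, assuming the script is readable from the driver's filesystem; the path in the usage comment is a hypothetical placeholder. It only exercises the script on the driver of the current cluster, so it will not reproduce everything that happens during job cluster startup.

```python
import subprocess


def run_script(path, timeout=600):
    """Run a shell script with bash and return (exit_code, stdout, stderr)."""
    result = subprocess.run(
        ["bash", path],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output to str
        timeout=timeout,      # raise TimeoutExpired if the script hangs
    )
    return result.returncode, result.stdout, result.stderr


# Example usage in a notebook cell (hypothetical path, replace with your own):
# code, out, err = run_script("/Workspace/Shared/init_scripts/install_libs.sh")
# print("exit code:", code)
# print(out)
# print(err)
```

A non-zero exit code, or pip errors in the captured stderr, would point to the script itself rather than to the way ADF passes it to the cluster.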

