06-02-2022 12:28 PM
I'm trying to install a non-standard package on the cluster using init scripts. The package has to be downloaded with wget and uncompressed with tar. It then needs to be added to the PATH, or at least I need to know where the downloaded files live.
Here are the contents of my init scripts:
wget https://github.com/COMBINE-lab/salmon/releases/download/v1.8.0/salmon-1.8.0_linux_x86_64.tar.gz
tar xzvf salmon-1.8.0_linux_x86_64.tar.gz
SALMON_PATH=$(readlink -f ./salmon-1.8.0_linux_x86_64/bin/)
export PATH="$SALMON_PATH:$PATH"
But this isn't working, at least in a notebook.
06-02-2022 04:12 PM
@Bradley (Customer), You can try passing the directory option (tar -C <directory>) when extracting the tar file, so the files land in a known location.
Or you can download the file to a DBFS location and then use the init script to copy it to the path where it should live on the cluster.
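For readers who would rather do this from a notebook than from a shell init script, the "extract into a known directory" idea can be sketched in Python. This is only a minimal sketch under stated assumptions: the helper names download and extract_to are hypothetical, the URL is the one from the original post, and the destination directory is whatever you choose (for example a folder under /dbfs/FileStore).

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the original post.
SALMON_URL = (
    "https://github.com/COMBINE-lab/salmon/releases/download/"
    "v1.8.0/salmon-1.8.0_linux_x86_64.tar.gz"
)


def download(url: str, archive_path: str) -> Path:
    """Download url to archive_path (the wget step). Hypothetical helper."""
    archive = Path(archive_path)
    archive.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(archive))
    return archive


def extract_to(archive_path: str, dest_dir: str) -> Path:
    """Extract a .tar.gz into dest_dir, like `tar xzf <archive> -C <dest_dir>`.

    Hypothetical helper; returns the destination so callers know where
    the files live.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(str(dest))
    return dest
```

Because extract_to returns the destination, the location of the bin directory is known up front instead of depending on the working directory the script happened to run in.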
06-02-2022 04:46 PM
@Bradley (Customer), wget downloads the file to the driver's local disk, so it then needs to be moved to the distributed file system. You can list the distributed file system with %sh ls /dbfs/
06-05-2022 09:10 PM
@Bradley Wright Could you please share the error message you are receiving when using the INIT script?
06-09-2022 12:47 AM
Hi @Bradley Wright, We haven't heard from you since the last response from @Arvind Ravish, and I was checking back to see if you have a resolution yet. If you have a solution, please share it with the community, as it can be helpful to others. Otherwise, we will respond with more details and try to help.
06-20-2022 10:13 AM
Thank you for the support. Yes, I was able to find a working solution.
os.environ['PATH'] += ':/dbfs/FileStore/salmon/bin/'
And that was it! I stored the file paths to the input data in a DataFrame, then used Spark to iterate across the rows of the DataFrame, calling a custom function that calls the binary file.
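The custom function is not shown in the thread, so the following is only a rough sketch of what such a wrapper might look like. The names add_to_path and run_binary are hypothetical. The directory is the one from the post. Setting PATH inside the function, rather than once in the notebook, is an assumption made so that the change applies in whichever process ends up running the function.

```python
import os
import subprocess
from typing import Sequence

# Directory taken from the post above.
SALMON_BIN = "/dbfs/FileStore/salmon/bin/"


def add_to_path(directory: str) -> None:
    """Append directory to PATH for the current process, once."""
    parts = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.append(directory)
        os.environ["PATH"] = os.pathsep.join(parts)


def run_binary(args: Sequence[str], bin_dir: str = SALMON_BIN) -> int:
    """Run a command with bin_dir on PATH and return its exit code.

    Hypothetical wrapper: args is the full command line as a list,
    with the program name first.
    """
    add_to_path(bin_dir)
    completed = subprocess.run(list(args), capture_output=True, text=True)
    return completed.returncode
```

On a cluster, a call such as run_binary(["salmon", "--version"]) would be a quick check that the binary is found. A function like this could then be applied to each row of the DataFrame of input paths. How the original poster wired that up is not stated, so that part is left out here.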
06-21-2022 05:51 AM
Hi @Bradley Wright, Thank you for sharing the solution with the community. Please accept my deepest thanks. Would you mind marking the best answer for us?
07-15-2022 05:27 PM
I am happy that my suggestion added some pointers for you to resolve the issue.