
Cannot install TA-LIB via cluster libraries

Kash
Contributor III

Hi there,

I can't find a way to install TA-Lib on a Databricks cluster through the cluster libraries. I can install it manually from a notebook using the code below, but if we detach the notebook I have to install it again.

Please let me know if you've found a fix for this.

Thanks!

K

%sh
wget http://phoenixnap.dl.sourceforge.net/project/ta-lib/ta-lib/0.4.0/ta-lib-0.4.0-src.tar.gz \
  && sudo tar -xzf ta-lib-0.4.0-src.tar.gz \
  && sudo rm ta-lib-0.4.0-src.tar.gz \
  && cd ta-lib/ \
  && sudo ./configure --prefix=/usr \
  && sudo make \
  && sudo make install \
  && cd .. \
  && sudo rm -rf ta-lib/ \
  && pip install ta-lib


Vivian_Wilfred
Honored Contributor

Hi @Avkash Kana, have you tried uploading the library to DBFS and installing it on the cluster from there? You can open the Libraries tab in the cluster UI and add your library there.

https://docs.databricks.com/libraries/workspace-libraries.html#reference-an-uploaded-jar-python-egg-...

Kash
Contributor III

Hi Vivian,

Thanks for the note. I tried that, and it shows the package as installed, but when I try to import talib in the notebook it says: No module named 'talib'.

Any thoughts on what might be happening?

Thanks,

K

Vivian_Wilfred
Honored Contributor

@Avkash Kana, can you run %sh pip list in a notebook attached to the cluster and check whether the library is listed there?

Does it require any other dependencies to work? Please share the error trace when you import.
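A cluster library can show as installed while the notebook's Python interpreter still cannot see it, so it may help to check from Python as well as from %sh pip list. The snippet below is a minimal sketch, not an official diagnostic: module_status is a hypothetical helper name, and "TA-Lib" is assumed to be the distribution name that pip recorded for the wrapper.

```python
import importlib.util
import sys
from importlib import metadata


def module_status(module_name, dist_name=None):
    """Report whether module_name is importable by the current interpreter."""
    # find_spec returns None for a top-level module that cannot be found.
    spec = importlib.util.find_spec(module_name)
    try:
        # Assumption: the installed distribution is named dist_name.
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        version = None
    return {
        "interpreter": sys.executable,  # the Python this notebook runs on
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
        "version": version,
    }


print(module_status("talib", dist_name="TA-Lib"))
```

If "found" is False while pip list shows the package, the pip that installed it probably belongs to a different Python environment than the interpreter path printed here.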

Hi Vivian,

Thanks for the quick reply.

When I install it manually from the notebook, it shows up in pip list, but if I detach and re-attach the notebook, it goes away.

I believe it does require other packages; however, I think we are already installing them through the cluster libraries. If it needed a package that prevented the wheel from installing, we would see an error, right?

@Avkash Kana

Please create an init script (see below) and attach the script to the cluster. This should resolve your issue.

%python
# Write the init script to DBFS; the final True overwrites any existing file.
dbutils.fs.put("/databricks/scripts/ta-lib-install.sh","""#!/bin/bash
# Build the TA-Lib C library from source, then install the Python wrapper.
/databricks/python/bin/pip install --upgrade numpy
wget http://phoenixnap.dl.sourceforge.net/project/ta-lib/ta-lib/0.4.0/ta-lib-0.4.0-src.tar.gz \
 && sudo tar -xzf ta-lib-0.4.0-src.tar.gz \
 && sudo rm ta-lib-0.4.0-src.tar.gz \
 && cd ta-lib/ \
 && sudo ./configure --prefix=/usr \
 && sudo make \
 && sudo make install \
 && cd .. \
 && sudo rm -rf ta-lib/ \
 && /databricks/python/bin/pip install ta-lib""", True)

How to: https://docs.databricks.com/clusters/init-scripts.html#configure-a-cluster-scoped-init-script-using-the-ui
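Once the cluster has restarted with the init script attached, a quick smoke test in a notebook cell can confirm that both the C library and the Python wrapper are usable. This is only a sketch under a few assumptions: smoke_test is a hypothetical helper, "ta_lib" is assumed to be the name under which the shared library was installed, and the sample prices are made up.

```python
import ctypes.util


def smoke_test():
    """Check for the TA-Lib C library and try one call through the wrapper."""
    # Assumption: the C library was installed as libta_lib; None means not found.
    c_lib = ctypes.util.find_library("ta_lib")
    try:
        import numpy as np
        import talib
    except ImportError as exc:
        return {"c_library": c_lib, "wrapper_ok": False, "detail": str(exc)}
    # Illustrative data only: a 3-period simple moving average over five prices.
    prices = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    sma = talib.SMA(prices, timeperiod=3)
    return {"c_library": c_lib, "wrapper_ok": True, "detail": f"SMA sample: {sma.tolist()}"}


result = smoke_test()
print(result)
```

If wrapper_ok is False, the cluster's init script logs are the next place to look for a failed build or pip step.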
