Data Engineering

Cannot install TA-LIB via cluster libraries

Kash
Contributor III

Hi there,

I can't seem to find a way to install TA-Lib on a Databricks cluster. I can install it manually in a notebook using the code below, but if we detach the notebook I have to install it again.

Please let me know if you've found a fix for this.

Thanks!

K

%sh
wget http://phoenixnap.dl.sourceforge.net/project/ta-lib/ta-lib/0.4.0/ta-lib-0.4.0-src.tar.gz \
  && sudo tar -xzf ta-lib-0.4.0-src.tar.gz \
  && sudo rm ta-lib-0.4.0-src.tar.gz \
  && cd ta-lib/ \
  && sudo ./configure --prefix=/usr \
  && sudo make \
  && sudo make install \
  && cd ~ \
  && sudo rm -rf ta-lib/ \
  && pip install ta-lib

5 REPLIES

Vivian_Wilfred
Databricks Employee

Hi @Avkash Kana, have you tried uploading the library to DBFS and installing it on the cluster from there? You can click the Libraries tab in the cluster UI and add your library there.

https://docs.databricks.com/libraries/workspace-libraries.html#reference-an-uploaded-jar-python-egg-...

Kash
Contributor III

Hi Vivian,

Thanks for the note. I tried that and it shows the package as installed, but when I try to import talib in the notebook it says: No module named 'talib'

Any thoughts on what might be happening?

Thanks,

K

Vivian_Wilfred
Databricks Employee

@Avkash Kana Can you run %sh pip list in a notebook attached to the cluster and check whether the library is shown there?

Does it require any other dependencies to work? Please share the error trace when you import.
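If it helps, the same check can be made from Python itself, which also shows which interpreter the notebook is using (a %sh pip may belong to a different environment than the notebook's Python). This is only a minimal sketch: `module_status` is a hypothetical helper name, not a Databricks API, and it relies solely on the standard library.

```python
import importlib.util
import sys


def module_status(name):
    """Report whether `name` is importable by the interpreter running this cell."""
    spec = importlib.util.find_spec(name)  # None when the module cannot be found
    return {
        "interpreter": sys.executable,
        "importable": spec is not None,
        "location": getattr(spec, "origin", None),
    }


# 'talib' is the import name of the Python wrapper installed by `pip install ta-lib`.
print(module_status("talib"))
```

If "importable" is False while pip list shows the package, the package was most likely installed into a different Python environment than the one the notebook runs.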

Hi Vivian,

Thanks for the quick reply.

When I install it manually from the notebook it shows up in pip list, but if I detach and re-attach it goes away.

I believe it does require other packages; however, I believe we are already installing them through the cluster libraries. If it needed a package that prevented the wheel from installing, we would see an error, right?

@Avkash Kana

Please create an init script (see below) and attach the script to the cluster. This should resolve your issue.

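%python
# Write a cluster-scoped init script to DBFS; the final True overwrites any existing file.
# The shebang has to be the first line of the file, so it follows the opening quotes directly.
# pip is called by its full path so the wrapper lands in the cluster's Python environment.
dbutils.fs.put("/databricks/scripts/ta-lib-install.sh","""#!/bin/bash
/databricks/python/bin/pip install --upgrade numpy
wget http://phoenixnap.dl.sourceforge.net/project/ta-lib/ta-lib/0.4.0/ta-lib-0.4.0-src.tar.gz \
 && sudo tar -xzf ta-lib-0.4.0-src.tar.gz \
 && sudo rm ta-lib-0.4.0-src.tar.gz \
 && cd ta-lib/ \
 && sudo ./configure --prefix=/usr \
 && sudo make \
 && sudo make install \
 && cd ~ \
 && sudo rm -rf ta-lib/ \
 && /databricks/python/bin/pip install ta-lib""", True)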

How to: https://docs.databricks.com/clusters/init-scripts.html#configure-a-cluster-scoped-init-script-using-the-ui
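Once the cluster has restarted with the init script attached, a quick sanity check from a notebook cell can tell you whether the native library built by `make install` is visible to the dynamic loader. This is a rough sketch, assuming the C library is registered under the name `ta_lib` (i.e. `libta_lib.so`) and that the loader cache is up to date; `find_native_lib` is a hypothetical helper, and only the standard-library `ctypes.util.find_library` is used.

```python
import ctypes.util


def find_native_lib(name="ta_lib"):
    """Return the shared-library file name the loader resolves for `name`, or None."""
    # Assumption: the TA-Lib C library is registered as 'ta_lib'.
    return ctypes.util.find_library(name)


native = find_native_lib()
if native is None:
    print("TA-Lib C library not found; check the init script logs for build errors.")
else:
    print("TA-Lib C library found:", native)
```

If the C library is found but `import talib` still fails, the Python wrapper was probably installed into a different environment than the one the notebook uses.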
