Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Install ODBC driver by init script

LukaszJ
Contributor III

Hello,

I want to install ODBC driver (for pyodbc).

I tried to do it with Terraform, but I don't think it is possible.

So I want to do it with an init script on my cluster. I found this code on the internet, and it works when I run it manually on a running cluster:

curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get -y install msodbcsql17
apt-get -y install unixodbc-dev

So I made an init script:

file_path = "/databricks/init_script/my_script.bash"
file_content = """
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get -y install msodbcsql17
apt-get -y install unixodbc-dev"""
dbutils.fs.put(file_path, file_content, True)
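For reference, the whole script can be kept in a single triple-quoted string so that every line, including apt-get update, actually ends up in the written file. A minimal local sketch, using a plain file write in place of dbutils.fs.put (which only exists on Databricks) and a temp path instead of the DBFS path:

```python
import os
import tempfile

# Assemble the whole init script in ONE triple-quoted string, so every
# line (including "apt-get update") is part of the file content.
script = """#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get -y install msodbcsql17
apt-get -y install unixodbc-dev
"""

# Locally, a plain file write stands in for dbutils.fs.put(path, script, True)
path = os.path.join(tempfile.gettempdir(), "my_script.bash")
with open(path, "w") as f:
    f.write(script)
```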

And the problem is in line 5 (apt-get update).

Without that line, the driver does not work.

With it, the cluster fails to start with: Script exit status is non-zero.

Do you know what I should do?

Best regards,

Łukasz

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

This is the code I am using for the pyodbc init script. In the cluster/task settings I have also added the pyodbc library from PyPI.

dbutils.fs.put("/databricks/scripts/pyodbc-install.sh","""
#!/bin/bash
sudo apt-key add /dbfs/databricks/scripts/microsoft.asc
sudo cp -f /dbfs/databricks/scripts/prod.list /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -y install msodbcsql17""", True)
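Once the driver is installed and the pyodbc PyPI library is attached, the driver name it registers ("ODBC Driver 17 for SQL Server") is what goes into the connection string. A small sketch of building one; the server, database, and credentials below are hypothetical placeholders:

```python
# Build a pyodbc connection string for the driver the init script installs.
# SERVER/DATABASE/UID/PWD are placeholder values, not real endpoints.
params = {
    "DRIVER": "{ODBC Driver 17 for SQL Server}",
    "SERVER": "myserver.database.windows.net",
    "DATABASE": "mydb",
    "UID": "myuser",
    "PWD": "mypassword",
}
conn_str = ";".join(f"{k}={v}" for k, v in params.items())

# On a cluster with the driver and library installed, this would be used as:
# import pyodbc
# conn = pyodbc.connect(conn_str)
```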

View solution in original post

6 REPLIES


Hubert-Dudek
Esteemed Contributor III

Plus, I downloaded the Microsoft files permanently to DBFS, as they indeed crashed the script (often that page wasn't even reachable):

%sh
sudo curl -k https://packages.microsoft.com/keys/microsoft.asc > /dbfs/databricks/scripts/microsoft.asc 
sudo curl -k https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /dbfs/databricks/scripts/prod.list
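Since the cluster fails whenever any init-script command exits non-zero, another option is to retry the flaky download a few times before giving up. A sketch; the retry count and sleep interval are arbitrary choices:

```shell
#!/bin/bash
# Retry a download a few times before letting the init script fail.
fetch_with_retry() {
    local url="$1" dest="$2" attempt
    for attempt in 1 2 3; do
        if curl -fsS "$url" -o "$dest"; then
            return 0
        fi
        echo "download attempt $attempt failed, retrying..." >&2
        sleep 5
    done
    return 1
}

# Example usage (same files the init script needs):
# fetch_with_retry https://packages.microsoft.com/keys/microsoft.asc /dbfs/databricks/scripts/microsoft.asc
# fetch_with_retry https://packages.microsoft.com/config/ubuntu/16.04/prod.list /dbfs/databricks/scripts/prod.list
```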

LukaszJ
Contributor III

Thank you Hubert for help.

Your code works for me! 😊

Best regards,

Łukasz

Shourya
New Contributor III

Could you please share how you solved the problem? I'm facing the same issue.

MayaBakh_80151
New Contributor II

Is there a way to install this from a notebook only, without using a DBFS init script?
With the new DBR policy of no DBFS init scripts, I need to migrate the shell script solution to a notebook in my workspace.
Any leads on where to start?

 

MayaBakh_80151
New Contributor II

Actually, I found this article and am using it to migrate my shell script to the workspace:
Cluster-named and cluster-scoped init script migration notebook - Databricks
