Data Engineering

Install ODBC driver by init script

LukaszJ
Contributor III

Hello,

I want to install the ODBC driver (for pyodbc).

I have tried to do it using Terraform, but I think it is impossible.

So I want to do it with an init script on my cluster. I have code from the internet, and it works when I run it on the cluster right after it starts:

curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get -y install msodbcsql17
apt-get -y install unixodbc-dev

So I made an init script:

file_path = "/databricks/init_script/my_scipy.bash"
file_content = """
curl https://packages.microsoft.com/keys/microsoft.asc | apth-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release/list"""
apt-get update
ACCEPT_EULA=Y apt-get install msodbcsql17
apt-get -y install unixodbc-dev"""
dbutils.fs.put(file_path, file_contnet, True)
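For reference, a DBFS init script written this way then has to be attached in the cluster configuration. A minimal sketch of the relevant part of the cluster JSON, assuming the path used above:

{
  "init_scripts": [
    { "dbfs": { "destination": "dbfs:/databricks/init_script/my_scipy.bash" } }
  ]
}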

And the problem is on line 5 (apt-get update).

Without that line, the driver does not work.

With that line, the cluster cannot start because: Script exit status is non-zero.

Do you know what I should do?

Best regards,

Łukasz

1 ACCEPTED SOLUTION

Hubert-Dudek
Esteemed Contributor III

This is the code I am using for the pyodbc init script. In the cluster/task settings, I have also added the pyodbc library from PyPI.

dbutils.fs.put("/databricks/scripts/pyodbc-install.sh","""
#!/bin/bash
sudo apt-key add /dbfs/databricks/scripts/microsoft.asc
sudo cp -f /dbfs/databricks/scripts/prod.list /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install msodbcsql17""", True)
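Once the cluster has started with this init script, a quick way to confirm the driver from a notebook is a minimal pyodbc connectivity check; the server, database, and credentials below are hypothetical placeholders:

import pyodbc

# Hypothetical connection details -- replace with your own server, database, and credentials
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.database.windows.net;"
    "DATABASE=mydb;UID=myuser;PWD=mypassword"
)
print(conn.cursor().execute("SELECT 1").fetchone())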


6 REPLIES


Hubert-Dudek
Esteemed Contributor III

Plus, I downloaded the Microsoft files to DBFS permanently, as their server indeed kept failing (often that page didn't even work):

%sh
# Save the Microsoft signing key and repo list to DBFS for use by the init script
sudo curl -k https://packages.microsoft.com/keys/microsoft.asc > /dbfs/databricks/scripts/microsoft.asc
sudo curl -k https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /dbfs/databricks/scripts/prod.list
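To check that the driver actually got registered after the init script ran, one option is to list the drivers unixODBC knows about (assuming unixODBC's tools are present, which msodbcsql17 pulls in as a dependency):

%sh
# "ODBC Driver 17 for SQL Server" should appear in both outputs
odbcinst -q -d
cat /etc/odbcinst.ini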

LukaszJ
Contributor III

Thank you, Hubert, for the help.

Your code works for me! 😊

Best regards,

Łukasz

Shourya
New Contributor III

Could you please share how you solved the problem? I'm facing the same issue.

MayaBakh_80151
New Contributor II

Is there a way to install it via a notebook only, without using a DBFS init script?
With the new DBR policy of no init scripts, I need to migrate the shell script solution to a notebook in my workspace.
Any leads on where we can start?
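One possible starting point, building on the %sh commands shown earlier in this thread, is to run the same install steps directly from a notebook cell. Note the assumption that %sh executes on the driver node only, so this installs the driver there and not on the workers:

%sh
# Same steps as the original init script, run from a notebook cell (driver node only)
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -y install msodbcsql17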

 

MayaBakh_80151
New Contributor II

Actually, I found this article and am using it to migrate my shell script to the workspace:
Cluster-named and cluster-scoped init script migration notebook - Databricks
