Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Install ODBC driver by init script

LukaszJ
Contributor III

Hello,

I want to install the ODBC driver (for pyodbc).

I tried to do it with Terraform, but I think that is impossible.

So I want to do it with an init script on my cluster. I found this code on the internet, and it works when I run it at cluster startup:

curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get -y install msodbcsql17
apt-get -y install unixodbc-dev

So I made an init script:

file_path = "/databricks/init_script/my_scipy.bash"
file_content = """
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get -y install msodbcsql17
apt-get -y install unixodbc-dev"""
dbutils.fs.put(file_path, file_content, True)

The problem is with the apt-get update line.

Without that line, the driver does not work.

With it, the cluster cannot start, because: Script exit status is non-zero

Do you know what I should do?

Best regards,

ลukasz

1 ACCEPTED SOLUTION


Hubert-Dudek
Esteemed Contributor III

This is the code I am using for the pyodbc init script. In the cluster/task settings I have also added the pyodbc library from PyPI.

dbutils.fs.put("/databricks/scripts/pyodbc-install.sh","""
#!/bin/bash
sudo apt-key add /dbfs/databricks/scripts/microsoft.asc
sudo cp -f /dbfs/databricks/scripts/prod.list /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -y install msodbcsql17""", True)
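Once the init script has installed msodbcsql17 and the pyodbc PyPI library is attached to the cluster, the driver can be used from a notebook. A minimal sketch follows; the server, database, and credentials are placeholders, not values from this thread.

```python
# pyodbc comes from the PyPI library attached to the cluster, and
# "ODBC Driver 17 for SQL Server" is the driver name that msodbcsql17 registers.
# import pyodbc  # available on the cluster once the library is attached

server = "myserver.database.windows.net"   # placeholder hostname
database = "mydb"                          # placeholder database name

conn_str = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    f"SERVER={server};DATABASE={database};"
    "UID=myuser;PWD=mypassword"            # placeholder credentials
)

# conn = pyodbc.connect(conn_str)
# print(conn.cursor().execute("SELECT 1").fetchval())
```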


6 REPLIES


Hubert-Dudek
Esteemed Contributor III

I also downloaded the Microsoft files to DBFS permanently, as downloading them at cluster start indeed kept failing (often that page wasn't even available):

%sh
sudo curl -k https://packages.microsoft.com/keys/microsoft.asc > /dbfs/databricks/scripts/microsoft.asc 
sudo curl -k https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /dbfs/databricks/scripts/prod.list
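Since the cached copies are only useful if the downloads actually succeeded, a quick sanity check can be run after the %sh cell. `cached_file_ok` is a hypothetical helper, and the DBFS paths are the ones from the snippet above.

```python
import os

def cached_file_ok(path: str) -> bool:
    """Return True if the cached file exists and is non-empty."""
    return os.path.isfile(path) and os.path.getsize(path) > 0

# On Databricks this would check the files written by the %sh cell above:
# for p in ("/dbfs/databricks/scripts/microsoft.asc",
#           "/dbfs/databricks/scripts/prod.list"):
#     assert cached_file_ok(p), f"missing or empty cached file: {p}"
```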

LukaszJ
Contributor III

Thank you Hubert for help.

Your code works for me! 😊

Best regards,

ลukasz

Shourya
New Contributor III

Could you please share how you solved the problem? I'm facing the same issue.

MayaBakh_80151
New Contributor II

Is there a way to install it from a notebook only, without using a DBFS init script?
With the new DBR policy of no init scripts, I need to migrate my shell script solution to a notebook in my workspace.
Any leads on where we can start?

 

MayaBakh_80151
New Contributor II

I actually found this article and am using it to migrate my shell script to the workspace:
Cluster-named and cluster-scoped init script migration notebook - Databricks
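For a scripted migration, a former init script can also be pushed into the workspace with the Workspace API's import endpoint. This is only a sketch of the request payload, assuming a standard `POST /api/2.0/workspace/import` call; the workspace path and script body are placeholders.

```python
import base64

script = "#!/bin/bash\nsudo apt-get update\n"   # placeholder script body

# Payload for POST {host}/api/2.0/workspace/import (Databricks Workspace API);
# "content" must be base64-encoded, and "AUTO" lets the server infer the format.
payload = {
    "path": "/Shared/init-scripts/pyodbc-install.sh",  # placeholder workspace path
    "format": "AUTO",
    "overwrite": True,
    "content": base64.b64encode(script.encode()).decode(),
}

# import requests
# requests.post(f"{host}/api/2.0/workspace/import",
#               headers={"Authorization": f"Bearer {token}"}, json=payload)
```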
