ODBC driver installation - help needed

DylanStout
Contributor

Hello, 

I’m trying to use pyodbc inside Databricks to connect to a SQL Server database, but I’m working in a restricted, offline Databricks workspace (no outbound internet).

What I’ve learned so far:

  • Databricks clusters do not include Microsoft’s ODBC Driver 17 or 18 by default.

  • I downloaded the .deb package manually:

    msodbcsql17_17.10.6.1-1_amd64.deb

    and uploaded it to:

    /Workspace/Users/<user>/odbc/

  • When I try to install it from a .py script using dpkg -i, it does not work because:

    • Python jobs run as non-root, so dpkg fails with “requires superuser privilege”
    • Python jobs run on the driver only, not on executors
    • Installation would not persist across cluster restarts anyway

So my real goal is:

Install ODBC Driver 17 on all cluster nodes, offline, with no internet, and enable pyodbc to connect to SQL Server from Databricks.
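
A quick way to tell whether that goal is met on a node is to check that the driver is registered with unixODBC, since pyodbc resolves the driver by name through it. A minimal sketch, assuming the package registers itself under the name "ODBC Driver 17 for SQL Server" in /etc/odbcinst.ini (the usual default); the function name driver_registered is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the named driver has a section in odbcinst.ini.
# Assumes the default registration name and the default file location.
driver_registered() {
  local name="${1:-ODBC Driver 17 for SQL Server}"
  local ini="${2:-/etc/odbcinst.ini}"
  # -F: fixed string, -x: the whole line must match, -q: no output
  grep -Fxq "[${name}]" "${ini}" 2>/dev/null
}

if driver_registered; then
  echo "driver registered"
else
  echo "driver not registered"
fi
```

Run at the end of an init script, a check like this would execute on every node, so a missing registration should show up in each node's init script log.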

From what I understand, the correct approach is:

  • Use an init script, which runs as root, to install the .deb file at cluster startup.
  • Possibly install additional dependency .deb packages as well (libodbc1, unixodbc, odbcinst1debian2, etc.).

I’m looking for guidance from anyone who has successfully done an offline ODBC driver installation in Databricks.

Currently I am running this shell script as an init script:

#!/bin/bash
# install_msodbcsql17_offline.sh

# Where you uploaded the packages
PKG_DIR="/Workspace/Users/<me>/odbc"

# Install msodbcsql17 from local .deb
dpkg -i "${PKG_DIR}/msodbcsql17_17.10.6.1-1_amd64.deb" || true

However, this script never completes: cluster startup stays stuck on "Running Init Scripts".
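
One possible cause of the hang (an assumption, not verified on this cluster): the msodbcsql17 package asks for EULA acceptance during installation, and with no terminal attached that prompt may wait indefinitely. Microsoft's install instructions set ACCEPT_EULA=Y for this reason. A minimal sketch of a non-interactive variant of the script above follows. The names install_debs and DPKG_CMD are hypothetical, and passing every .deb in the folder to a single dpkg -i call assumes all required dependency packages were uploaded there too.

```shell
#!/bin/bash
# Sketch of a non-interactive variant of the init script above (not a verified fix).
PKG_DIR="/Workspace/Users/<me>/odbc"   # placeholder path from the original script
DPKG_CMD="${DPKG_CMD:-dpkg}"           # hypothetical override, useful for dry runs

install_debs() {
  local dir="$1"
  local debs=()
  local f
  # Collect every .deb in the folder (driver plus any uploaded dependencies).
  for f in "${dir}"/*.deb; do
    if [ -e "${f}" ]; then
      debs+=("${f}")
    fi
  done
  if [ "${#debs[@]}" -eq 0 ]; then
    echo "no .deb files found in ${dir}" >&2
    return 1
  fi
  # ACCEPT_EULA=Y pre-accepts the license; the noninteractive frontend and
  # stdin from /dev/null keep any remaining prompt from waiting for input.
  ACCEPT_EULA=Y DEBIAN_FRONTEND=noninteractive \
    "${DPKG_CMD}" -i "${debs[@]}" < /dev/null
}
```

The init script would then end with install_debs "${PKG_DIR}", without the trailing || true, so that a failed install surfaces as an error instead of being silently ignored.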