Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Install the ODBC Driver 17 for SQL Server

GJ2
New Contributor II

Hi,

I am not a data engineer; I want to connect to SSAS (SQL Server Analysis Services). It looks like it can be done through pyodbc; however, it seems I need to install the "ODBC Driver 17 for SQL Server" first, using the command in the screenshot below. How do I install the driver on the cluster, and are there any requirements for connecting to SSAS?
 
[screenshot: GJ2_1-1739798450883.png]

Thank you 

9 REPLIES

Alberto_Umana
Databricks Employee

Hello @GJ2,

You will need to create a shell (.sh) init script file and configure it as a cluster-scoped init script: https://docs.databricks.com/en/init-scripts/cluster-scoped.html

You can use this script:

#!/bin/bash

# Add the Microsoft package repository and its signing key
curl -fsSL https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
curl -fsSL https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update

# Install ODBC Driver 17 for SQL Server (accept the EULA non-interactively)
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17

# Install the supporting ODBC packages
sudo apt-get -y install unixodbc libgssapi-krb5-2 odbcinst odbcinst1debian2 libodbc1

 

Save the initialization script in a location accessible by Databricks, e.g., DBFS, a workspace file, or a Unity Catalog volume.
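If you are working from a notebook, one convenient way to create the file is dbutils.fs.put, which writes Unix (LF) line endings (this matters, see the CRLF note further down). A minimal sketch; the volume path is a hypothetical example and the script body is abbreviated:

# Run in a Python notebook cell
script = """#!/bin/bash
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17
"""
dbutils.fs.put("/Volumes/main/default/packages/install_sqlserver_driver.sh", script, True)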

Attach the initialization script to your Databricks cluster:

  • Go to the Databricks workspace and open the "Compute" (Clusters) page.
  • Select the cluster you want to install the driver on.
  • Click "Edit" to modify the cluster configuration.
  • Scroll down to "Advanced Options" and select the "Init Scripts" tab.
  • Add a new script and provide the path to the .sh file saved in DBFS / workspace files.
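After the cluster restarts, you can check from a notebook that the driver is visible and that a connection works. A minimal sketch; the server, database, and credentials are placeholders, and note that this driver talks to the SQL Server relational engine, not to SSAS:

import pyodbc

# The installed driver should appear in this list once the init script has run.
print(pyodbc.drivers())

# Placeholder connection details, not real values.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<your-server>.database.windows.net;"
    "DATABASE=<your-database>;"
    "UID=<user>;PWD=<password>"
)
print(conn.execute("SELECT @@VERSION").fetchone()[0])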


Rjdudley

Hey @Alberto_Umana, do you know if Lakehouse Federation supports SSAS, or just the transactional catalog of SQL Server? I looked in the docs, SSAS isn't mentioned specifically, and now I'm curious.

I have the same requirement of installing the Microsoft ODBC driver. I added your code to the init script and got the following error: "databricks_error_message": "Cluster scoped init script /Volumes/packages/install_sqlserver_driver.sh failed: Script exit status is non-zero". How can we solve it?

dinarde
New Contributor II

I had this problem with an init script, and changing the line endings from CRLF to LF (Unix) solved it:

Incorrect line endings (CRLF vs. LF): if the script was created on a Windows system and uploaded to a Unity Catalog volume, it might have Windows-style line endings (CRLF). Linux environments (where Databricks clusters run) expect LF line endings.
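One way to fix this from a notebook is to rewrite the file in place with CRLF replaced by LF. A minimal sketch, using the volume path from the error message above:

# Run in a Python notebook cell
path = "/Volumes/packages/install_sqlserver_driver.sh"

with open(path, "rb") as f:
    data = f.read()

# Replace Windows line endings (CRLF) with Unix line endings (LF).
with open(path, "wb") as f:
    f.write(data.replace(b"\r\n", b"\n"))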

Alberto_Umana
Databricks Employee
Databricks Employee

Hi @Rjdudley - SSAS isn't supported, only SQL Server.

GJ2
New Contributor II

Hi @Alberto_Umana. Basically, what I want to do is get all the tables used in Power BI datasets. I am trying to connect to the XMLA endpoint through SSMS and get the table names used in each Power BI dataset. If I do it directly in SSAS then I can only query one dataset; I want to loop through all the PBI datasets (databases, when connected through SSAS) and get the tables used in each dataset. Is there any better way to do this? Can I do this in a Databricks notebook? Thanks heaps.
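One possible alternative to looping over the XMLA endpoint is the Power BI REST API, which a Databricks notebook can call directly with requests. A rough sketch only, not a verified recipe: it assumes you already hold an Azure AD access token with dataset read permissions, and that your datasets support the executeQueries endpoint and the DAX INFO.TABLES() function (older models may need a DMV query over XMLA instead):

import requests

token = "<azure-ad-access-token>"  # placeholder; e.g. acquired with the msal library
headers = {"Authorization": f"Bearer {token}"}
base = "https://api.powerbi.com/v1.0/myorg"

# Loop over every dataset visible to the caller and list its tables.
for ds in requests.get(f"{base}/datasets", headers=headers).json()["value"]:
    body = {"queries": [{"query": "EVALUATE INFO.TABLES()"}]}
    resp = requests.post(f"{base}/datasets/{ds['id']}/executeQueries",
                         headers=headers, json=body)
    if resp.ok:
        rows = resp.json()["results"][0]["tables"][0]["rows"]
        print(ds["name"], [r.get("[Name]") for r in rows])
    else:
        print(ds["name"], "-> executeQueries failed:", resp.status_code)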

karifclark
New Contributor II

To connect to SSAS using pyodbc, you're right that installing the "ODBC Driver 17 for SQL Server" is required. You can install it on your cluster using the appropriate system package manager (e.g., apt for Ubuntu or yum for CentOS), and make sure the driver is accessible from all nodes if it's a distributed environment. Also, ensure that your connection string includes the correct SSAS server name and authentication details.


Sheilaschaffer
New Contributor II

It looks like the error you're encountering is due to the format of the `table_name` argument you're passing to the `jdbc_writer` method. The line `schema, table = table_name.split('.')` expects a string with exactly one period so that it splits into two parts, typically in the form `"schema.table"`. However, in your case you're passing just `"Economy"`, which contains only one part, hence the unpacking error. To fix this, make sure you're providing a fully qualified table name that matches your schema/table naming convention.
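For illustration, a tiny sketch of the expected input; the names here are hypothetical:

# The method expects a fully qualified "schema.table" string.
table_name = "sales.Economy"  # hypothetical value
parts = table_name.split(".")
if len(parts) != 2:
    raise ValueError(f"Expected 'schema.table', got {table_name!r}")
schema, table = parts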