mark_ott
Databricks Employee
Databricks Employee

You are encountering an error because the default SQL Server JDBC driver bundled with Databricks may not fully support the authentication value "ActiveDirectoryManagedIdentity"—this option requires at least version 10.2.0 of the Microsoft SQL Server JDBC driver, and often custom configuration in Databricks to properly reference the newer driver and required authentication libraries.

Why This Happens

  • Default Driver Limitation: The default Databricks SQL Server connector uses an older Microsoft JDBC driver, which does not support the "ActiveDirectoryManagedIdentity" authentication method.

  • Custom Driver Needed: To use managed identity authentication (especially with a User Assigned Managed Identity), you must upload a newer version of mssql-jdbc (at least v10.2.0 or later) and the Microsoft Authentication Library (MSAL) JARs to your Databricks cluster.


Steps to Override the SQL Server JDBC Driver in Databricks

1. Download Required JAR Files

Download the following JARs:

  • Microsoft JDBC Driver for SQL Server (mssql-jdbc-10.2.0.jre8.jar or later)

  • Microsoft Authentication Library (MSAL) (msal4j-1.x.x.jar)

  • Dependent libraries (included with the driver or required by MSAL)

2. Upload JARs to Databricks

  • Go to Databricks Workspace > Clusters > your-cluster > Libraries.

  • Click Install New and upload the driver and MSAL JAR files.

3. Configure Your Notebook/Cluster

  • Attach the uploaded libraries to your Databricks cluster.

  • Restart the cluster to ensure the new driver is used.

4. Notebook Script Update

You do not specify user or password—you only need the client id for your Managed Identity.
Also, make sure the options reference the right values.

python
driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver" database_host = "xxxxx" database_port = "1433" database_name = "xxxx" UserAssignedMiClientId = "xxxxxx" # This is the client id of your managed identity table = "CAS_DEF" jdbc_url = ( f"jdbc:sqlserver://{database_host}:{database_port};" f"database={database_name};" f"authentication=ActiveDirectoryManagedIdentity;" f"msiClientId={UserAssignedMiClientId};" "encrypt=true;trustServerCertificate=true" ) df = (spark.read .format("jdbc") .option("url", jdbc_url) .option("dbtable", table) .option("driver", driver) .load())

Key Points:

  • Use "jdbc" (not "sqlserver") as the format.

  • Provide the full jdbc_url via the .option("url", ...).

  • Omit explicit user and password.

5. Databricks Permissions

  • Ensure your Databricks cluster or workspace is assigned the User Assigned Managed Identity, and that this identity has Azure AD and SQL Server permissions.


Troubleshooting

  • Driver Not Found: If you still get errors about the authentication type, double-check that the right driver version is installed and attached to the cluster.

  • Permission Issues: Make sure the managed identity has the correct roles in Azure and in SQL Server.

  • Cluster Restart: Always restart the cluster after adding new libraries.


Can You Override the Default Driver?

Yes, by uploading and attaching the newer JDBC and MSAL JARs, Databricks will use those instead of the built-in JDBC driver.