Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

The authentication value "ActiveDirectoryManagedIdentity" is not valid.

akuma643
New Contributor II

Hi Team,

I am trying to connect from Databricks to a SQL Server instance hosted in an Azure VM using Entra ID authentication ("authentication", "ActiveDirectoryManagedIdentity").

Below is the notebook script I am using.

 

driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
database_host = "xxxxx"
database_port = "1433"  # update if you use a non-default port
database_name = "xxxx"
UserAssignedMiClientId = "xxxxxx"
table = "CAS_DEF"
user = "<username>"
password = "<password>"

url = f"jdbc:sqlserver://{database_host}:{database_port};user={UserAssignedMiClientId};authentication=ActiveDirectoryManagedIdentity;encrypt=true;trustServerCertificate=true;database={database_name}"

remote_table = (spark.read
    .format("sqlserver")
    .option("host", database_host)
    .option("port", database_port)  # optional; defaults to 1433 if omitted
    .option("user", UserAssignedMiClientId)
    .option("authentication", "ActiveDirectoryManagedIdentity")
    .option("database", database_name)
    .option("dbtable", table)
    .load()
)
 
I am getting the exception below:

com.microsoft.sqlserver.jdbc.SQLServerException: The authentication value "ActiveDirectoryManagedIdentity" is not valid.
    at com.microsoft.sqlserver.jdbc.SqlAuthentication.valueOfString(SQLServerDriver.java:96)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:2325)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:1663)
    at com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1064)
    at org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:50)
 
I think Databricks is using the default SQL Server driver. Can I override that?

Please help us resolve this issue.
 
 
2 REPLIES

akuma643
New Contributor II

Can anyone help me out on this, please?

mark_ott
Databricks Employee

You are encountering this error because the default SQL Server JDBC driver bundled with Databricks may not support the authentication value "ActiveDirectoryManagedIdentity". That option requires version 10.2.0 or later of the Microsoft JDBC Driver for SQL Server, and it typically takes some extra configuration in Databricks for the newer driver and its authentication libraries to be picked up.

Why This Happens

  • Default Driver Limitation: The default Databricks SQL Server connector uses an older Microsoft JDBC driver, which does not support the "ActiveDirectoryManagedIdentity" authentication method.

  • Custom Driver Needed: To use managed identity authentication (especially with a User Assigned Managed Identity), you must upload a newer version of mssql-jdbc (at least v10.2.0 or later) and the Microsoft Authentication Library (MSAL) JARs to your Databricks cluster.


Steps to Override the SQL Server JDBC Driver in Databricks

1. Download Required JAR Files

Download the following JARs:

  • Microsoft JDBC Driver for SQL Server (mssql-jdbc-10.2.0.jre8.jar or later)

  • Microsoft Authentication Library (MSAL) (msal4j-1.x.x.jar)

  • Dependent libraries (included with the driver or required by MSAL)

2. Upload JARs to Databricks

  • Go to Databricks Workspace > Clusters > your-cluster > Libraries.

  • Click Install New and upload the driver and MSAL JAR files.
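
If you would rather script this step than use the UI, the Databricks Libraries REST API (POST /api/2.0/libraries/install) accepts a list of jar paths to attach to a cluster. Below is a minimal sketch of building that request body; the cluster ID, DBFS paths, and jar versions are placeholders, and the jars must already be uploaded to DBFS (or a volume) before installing.

```python
import json

def build_install_payload(cluster_id, jar_paths):
    """Build the request body for the Databricks Libraries API
    (POST /api/2.0/libraries/install): one {"jar": path} entry
    per jar to attach to the given cluster."""
    return {
        "cluster_id": cluster_id,
        "libraries": [{"jar": p} for p in jar_paths],
    }

payload = build_install_payload(
    "0123-456789-abcde",  # placeholder cluster ID
    [
        "dbfs:/FileStore/jars/mssql-jdbc-12.4.2.jre11.jar",  # placeholder path/version
        "dbfs:/FileStore/jars/msal4j-1.13.8.jar",            # placeholder path/version
    ],
)
print(json.dumps(payload, indent=2))

# Send it with, e.g.:
# requests.post(f"{workspace_url}/api/2.0/libraries/install",
#               headers={"Authorization": f"Bearer {token}"},
#               json=payload)
```

Installing via the API still requires the cluster restart described in the next step before the new driver takes effect.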

3. Configure Your Notebook/Cluster

  • Attach the uploaded libraries to your Databricks cluster.

  • Restart the cluster to ensure the new driver is used.

4. Notebook Script Update

You do not specify a user or password; you only need the client ID of your managed identity.
Also make sure the options reference the right values.

driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
database_host = "xxxxx"
database_port = "1433"
database_name = "xxxx"
UserAssignedMiClientId = "xxxxxx"  # the client ID of your managed identity
table = "CAS_DEF"

jdbc_url = (
    f"jdbc:sqlserver://{database_host}:{database_port};"
    f"database={database_name};"
    f"authentication=ActiveDirectoryManagedIdentity;"
    f"msiClientId={UserAssignedMiClientId};"
    "encrypt=true;trustServerCertificate=true"
)

df = (spark.read
    .format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", table)
    .option("driver", driver)
    .load())

Key Points:

  • Use "jdbc" (not "sqlserver") as the format.

  • Provide the full jdbc_url via the .option("url", ...).

  • Omit explicit user and password.

5. Databricks Permissions

  • Ensure your Databricks cluster or workspace is assigned the User Assigned Managed Identity, and that this identity has Azure AD and SQL Server permissions.


Troubleshooting

  • Driver Not Found: If you still get errors about the authentication type, double-check that the right driver version is installed and attached to the cluster.

  • Permission Issues: Make sure the managed identity has the correct roles in Azure and in SQL Server.

  • Cluster Restart: Always restart the cluster after adding new libraries.
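
For the first troubleshooting case, one quick sanity check is to confirm that the jar you downloaded actually meets the 10.2.0 minimum before uploading it. A small hypothetical helper that parses the version out of the mssql-jdbc jar file name:

```python
import re

def mssql_jdbc_version(jar_name):
    """Extract (major, minor, patch) from an mssql-jdbc jar file name,
    e.g. 'mssql-jdbc-12.4.2.jre11.jar' -> (12, 4, 2)."""
    m = re.match(r"mssql-jdbc-(\d+)\.(\d+)\.(\d+)", jar_name)
    if not m:
        raise ValueError(f"unrecognized jar name: {jar_name}")
    return tuple(int(g) for g in m.groups())

def supports_managed_identity(jar_name):
    """ActiveDirectoryManagedIdentity needs driver 10.2.0 or later."""
    return mssql_jdbc_version(jar_name) >= (10, 2, 0)

print(supports_managed_identity("mssql-jdbc-12.4.2.jre11.jar"))  # True
print(supports_managed_identity("mssql-jdbc-9.4.1.jre8.jar"))    # False
```

This only inspects the file name, so it is a pre-upload check, not proof of what the cluster actually loaded.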


Can You Override the Default Driver?

Yes. Once you upload and attach the newer JDBC and MSAL JARs and point .option("driver", ...) at the driver class, Databricks will use those instead of the built-in JDBC driver.