Monday
Good day,
I need some help with automating a connection from Databricks to an Azure SQL Database. I'm able to configure the connection with the UI (Catalog Explorer), but I also want to configure it with a REST API (or SQL script) so that I can integrate it into my CI/CD pipeline.
- I have created an App Registration in Entra and created a secret
- The Azure SQL Database is configured for Entra Authentication Only
- I use the U2M (OAuth) method
I use the REST API "/api/2.1/unity-catalog/connections" with the JSON file below.
{
  "name": "testu2m",
  "connection_type": "SQLSERVER",
  "options": {
    "applicationIntent": "ReadOnly",
    "host": "sqlserver######.database.windows.net",
    "oauth_redirect_uri": "https://#########.azuredatabricks.net/login/oauth/azure.html",
    "client_id": "#########",
    "client_secret": "###########",
    "port": "1433",
    "authorization_endpoint": "https://login.microsoftonline.com/#############/oauth2/v2.0/authorize",
    "trustServerCertificate": "true",
    "oauth_scope": "https://database.windows.net/.default offline_access"
  }
}

When I call the REST API with this JSON file I receive this error:
{
  "error_code": "INVALID_PARAMETER_VALUE",
  "message": "CONNECTION/CONNECTION_SQLSERVER_OAUTH_U2M must include the following option(s): pkce_verifier,authorization_code.",
  "details": [
    {
      "@type": "type.googleapis.com/google.rpc.ErrorInfo",
      "reason": "MISSING_CONNECTION_OPTION",
      "domain": "unity-catalog.databricks.com",
      "metadata": {
        "securable_type": "CONNECTION",
        "securable_kind": "CONNECTION_SQLSERVER_OAUTH_U2M",
        "options": "pkce_verifier,authorization_code"
      }
    },
    {
      "@type": "type.googleapis.com/google.rpc.RequestInfo",
      "request_id": "55f708b4-866e-4b97-8e85-b898bba16ac0",
      "serving_data": ""
    }
  ]
}
It looks like I also need to supply the options pkce_verifier and authorization_code, but I do not know where to find these values.
Kind Regards,
Marco
Monday
Since the U2M authorization_code is bound to user consent, full automation is tricky. The recommended pattern for CI/CD is:
Use a Service Principal connection (machine-to-machine, “M2M”), not U2M.
That avoids any manual OAuth dance: you just store the client_id, client_secret, and tenant_id.
Here’s a minimal M2M connection JSON example:
{
  "name": "sql-m2m-conn",
  "connection_type": "SQLSERVER",
  "options": {
    "host": "sqlserver######.database.windows.net",
    "port": "1433",
    "trustServerCertificate": "true",
    "oauth_type": "M2M",
    "client_id": "<service_principal_client_id>",
    "client_secret": "<service_principal_secret>",
    "tenant_id": "<tenant_id>",
    "oauth_scope": "https://database.windows.net/.default"
  }
}

This is the recommended method to integrate into a CI/CD pipeline.
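To script the creation step itself, the JSON above can be POSTed to the connections endpoint. A minimal sketch, assuming a workspace URL and an API token (both placeholders here), with the payload assembled in Python:

```python
import json
import urllib.request

# Hypothetical placeholders -- substitute your workspace URL and a Databricks token.
WORKSPACE_URL = "https://<workspace>.azuredatabricks.net"
TOKEN = "<databricks_token>"

def build_m2m_connection(name, host, client_id, client_secret, tenant_id):
    """Assemble the request body for a SQLSERVER M2M connection."""
    return {
        "name": name,
        "connection_type": "SQLSERVER",
        "options": {
            "host": host,
            "port": "1433",
            "trustServerCertificate": "true",
            "oauth_type": "M2M",
            "client_id": client_id,
            "client_secret": client_secret,
            "tenant_id": tenant_id,
            "oauth_scope": "https://database.windows.net/.default",
        },
    }

def create_connection(payload):
    """POST the payload to the Unity Catalog connections endpoint."""
    req = urllib.request.Request(
        f"{WORKSPACE_URL}/api/2.1/unity-catalog/connections",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

In a deployment pipeline the secret would come from a secret store rather than being inlined, but the request shape is the same.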
Tuesday
Hi @Marco37, the error you’re seeing is expected for U2M (user-to-machine) connections. This flow requires an interactive OAuth PKCE process, where you must first obtain two one-time values before calling the API:
- authorization_code: returned to your redirect URI after the user signs in
- pkce_verifier: the random string you initially generated when starting the PKCE flow
Without these values, the API call will fail. You’ll need to perform the authorization step to generate the code and then use it immediately, as the authorization code is single-use and short-lived.
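The steps above can be sketched in a few lines. This generates the verifier/challenge pair per RFC 7636 and builds the authorization URL; the tenant, client, and redirect values are placeholders, not Marco's real configuration:

```python
import base64
import hashlib
import secrets
import urllib.parse

def make_pkce_pair():
    """Generate a PKCE code_verifier and its S256 code_challenge (RFC 7636)."""
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode()).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge

verifier, challenge = make_pkce_pair()

# Hypothetical placeholders -- use your tenant, app registration, and redirect URI.
authorize_url = (
    "https://login.microsoftonline.com/<tenant_id>/oauth2/v2.0/authorize?"
    + urllib.parse.urlencode({
        "client_id": "<client_id>",
        "response_type": "code",
        "redirect_uri": "https://<workspace>.azuredatabricks.net/login/oauth/azure.html",
        "scope": "https://database.windows.net/.default offline_access",
        "code_challenge": challenge,
        "code_challenge_method": "S256",
    })
)
# Open authorize_url in a browser; after sign-in, Entra redirects back with
# ?code=<authorization_code>. That code plus the verifier above are the two
# one-time values the connections API is asking for.
```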
Once created, the U2M connection allows interactive users to query Azure SQL through Lakehouse Federation using their delegated Entra ID credentials.
That said, I’m curious, is there a specific reason you prefer U2M over M2M (service principal)?
The M2M setup is generally simpler and more suitable for automation, as it avoids the manual authorization flow altogether.
Monday
Hi Marco,
The error you’re seeing
"must include pkce_verifier, authorization_code"
happens because you’re using the OAuth U2M (User-to-Machine) flow. That method requires an interactive PKCE authorization (browser sign-in) to generate authorization_code and pkce_verifier, so it isn’t suitable for automation or CI/CD.
Since your goal is to read data and integrate this into a CI/CD pipeline, I’d recommend switching to a Service Principal (M2M) setup instead. It uses the client credentials flow, no PKCE, no user interaction, and works perfectly with Databricks Lakehouse Federation (for query-in-place) or LakeBridge (for bulk migration).
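As a sketch of why no user interaction is needed: in the client credentials flow the service principal simply POSTs its id and secret to the Entra token endpoint. The tenant and credential values below are placeholders:

```python
import urllib.parse
import urllib.request

def build_token_request(tenant_id, client_id, client_secret):
    """Build the client-credentials token request to Entra ID.
    No PKCE and no browser: the service principal's id/secret are enough."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://database.windows.net/.default",
    }).encode()
    return urllib.request.Request(url, data=body, method="POST")

# Sending this request (urllib.request.urlopen) returns a JSON body containing
# an access_token for Azure SQL -- useful for verifying the SP's permissions
# independently of Databricks.
```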
Monday
Hi Nayan and Bianca,
Thanks for your replies.
The CI/CD pipeline I mentioned is the deployment pipeline I use to create/configure the connection, not a pipeline that will use the connection (I hadn't thought of that yet, so your replies are very welcome).
So I want to create a U2M connection, which interactive users can use to query the Azure SQL database from their notebooks. I can do this within the Catalog Explorer, but I do not succeed in creating it with the REST API (or a SQL script).
Kind Regards,
Marco
Tuesday
Hi Bianca,
Thanks for your help. If I understand correctly, the "authorization_code" and "pkce_verifier" are normally generated by the "Sign in with Azure Entra ID" button when I configure a connection through the Catalog Explorer.
My organization is new to Databricks, so every question from our customers is a challenge for us. I tried to configure a U2M connection because I thought it was more secure to let users connect to the database with their own credentials, so that I can configure permissions in the database itself (which user is allowed to query which table). If I use the M2M method, everybody has the same permissions (the permissions of the service principal).
But M2M is indeed more suitable for automation, and probably all users would get the same permissions anyway.
Kind Regards,
Marco