Generate a longer-lived token for Databricks with Azure

Etyr
Contributor II

I'm using DefaultAzureCredential from azure-identity to authenticate to Azure as a service principal through environment variables (AZURE_CLIENT_SECRET, AZURE_TENANT_ID, AZURE_CLIENT_ID).

I can call get_token with the Databricks scope like this:

from azure.identity import DefaultAzureCredential

# Application ID of the Azure Databricks resource, used as the token scope
dbx_scope = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"
token = DefaultAzureCredential().get_token(dbx_scope).token

This works great: I get the token, and then I can use `databricks-connect` to configure my connection to the cluster. It generates a configuration file ($HOME/.databricks-connect) that tells Spark where to connect and which token to use.

{
  "host": "https://adb-1234.azuredatabricks.net",
  "token": "eyJ0eXAiXXXXXXXXXXXXXXXXXXXXXx",
  "cluster_id": "1234",
  "org_id": "1234",
  "port": "15001"
}
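For reference, the same file can also be written from code instead of through the interactive `databricks-connect` prompt. This is only a minimal sketch: `write_connect_config` is a hypothetical helper of mine, not part of databricks-connect, and it simply dumps the five keys shown above as JSON.

```python
import json
import tempfile
from pathlib import Path


def write_connect_config(path, host, token, cluster_id, org_id, port="15001"):
    """Write a databricks-connect style JSON config and return its path.

    Hypothetical helper: it only mirrors the keys of the file shown above.
    """
    config = {
        "host": host,
        "token": token,
        "cluster_id": cluster_id,
        "org_id": org_id,
        "port": port,
    }
    path = Path(path)
    path.write_text(json.dumps(config, indent=2))
    return path


# Demo against a temporary file so the real ~/.databricks-connect is untouched.
demo_path = Path(tempfile.mkdtemp()) / ".databricks-connect"
write_connect_config(
    demo_path, "https://adb-1234.azuredatabricks.net", "<token>", "1234", "1234"
)
written = json.loads(demo_path.read_text())
```

In a real run the path would be `Path.home() / ".databricks-connect"` and the token would come from `get_token`.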

The issue is that this token does not last very long. When I use Spark for more than an hour, I get disconnected because the token has expired.
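To make the expiry visible, the object returned by `get_token` carries an `expires_on` Unix timestamp next to `token`. A rough sketch of checking it before a long job follows; `needs_refresh`, `fetch_access_token`, and the 300-second margin are my own illustrative choices, not something from azure-identity.

```python
import time

DBX_SCOPE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"


def needs_refresh(expires_on, now=None, margin_seconds=300):
    """Return True when the token expires within margin_seconds of now."""
    if now is None:
        now = time.time()
    return expires_on - now <= margin_seconds


def fetch_access_token(scope=DBX_SCOPE):
    """Fetch a fresh AccessToken (it has .token and .expires_on)."""
    # Imported lazily so needs_refresh can be used without azure-identity.
    from azure.identity import DefaultAzureCredential

    return DefaultAzureCredential().get_token(scope)
```

Usage would look like `tok = fetch_access_token()` followed by `needs_refresh(tok.expires_on)`. This only detects the expiry; whether an already running Spark session picks up a rewritten token is a separate matter that this sketch does not address.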

Is there a way to get a longer-lived token for Databricks with a service principal? Since this is meant for production, I would like my code to generate a PAT on each run; I don't want to create a PAT manually and store it in an Azure Key Vault.
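What I have in mind is roughly the following: use the short-lived Azure AD token as a bearer token against the Databricks Token API (`POST /api/2.0/token/create`) and ask for a PAT with a longer `lifetime_seconds`. This is an untested sketch that assumes the service principal has already been added to the workspace and is allowed to create tokens; `build_pat_request` and `create_pat` are hypothetical helper names.

```python
import json
import urllib.request


def build_pat_request(host, aad_token, lifetime_seconds, comment="created by job"):
    """Build (but do not send) a POST to the Databricks Token API."""
    body = json.dumps({"lifetime_seconds": lifetime_seconds, "comment": comment})
    return urllib.request.Request(
        url=host.rstrip("/") + "/api/2.0/token/create",
        data=body.encode("utf-8"),
        headers={
            "Authorization": "Bearer " + aad_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_pat(host, aad_token, lifetime_seconds):
    """Send the request and return the PAT from the response's token_value field.

    Assumption: the service principal is permitted to create tokens
    in this workspace.
    """
    request = build_pat_request(host, aad_token, lifetime_seconds)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["token_value"]


# Build an example request for an 8-hour PAT without sending it.
example = build_pat_request(
    "https://adb-1234.azuredatabricks.net", "<aad-token>", 8 * 3600
)
```

The returned PAT would then replace the "token" value in the configuration file for that run.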