Connect Databricks to Airflow

T_I
New Contributor II

Hi,

I have Databricks on AWS and a Databricks connection in Airflow (MWAA). I am able to connect and execute a Databricks job via Airflow using a personal access token. I believe the best practice is to connect using a service principal. I understand that I should use the client ID and the secret in order to connect, but I get a 401 error, which I believe is the result of an incorrect OAuth M2M setup.

Can someone shed some light on how it should be done?

 

Thanks.

Alberto_Umana
Databricks Employee

Hi @T_I ,

Have you followed this guide? https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html

You will have to use these two parameters from the service principal:

    • DATABRICKS_CLIENT_ID: The service principal’s client ID.
    • DATABRICKS_CLIENT_SECRET: The service principal’s secret.
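As a rough illustration of what these two values are used for: in OAuth M2M they are exchanged for a short-lived access token through a client-credentials request. The sketch below only builds that request and does not send it. The `/oidc/v1/token` path and the `all-apis` scope are taken from my reading of the guide linked above, and `build_token_request` is a hypothetical helper name, so please check both against the current docs.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(host: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request.

    Assumption: the workspace token endpoint is <host>/oidc/v1/token and the
    scope is "all-apis", as described in the Databricks OAuth M2M guide.
    """
    token_url = f"{host.rstrip('/')}/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The client ID and secret travel as HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` against a real workspace should return JSON containing an `access_token`. A 401 at this step usually points at a wrong client ID/secret pair or a secret generated for a different service principal.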

T_I
New Contributor II

I have followed this guide but I do not understand how to implement step 4 - OAuth M2M authentication...
If you could assist, it would be greatly appreciated.

Alberto_Umana
Databricks Employee

Hi @T_I,

Instead of the PAT, you have to specify the settings below to be able to use the service principal:

For workspace-level operations, set the following environment variables:
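A minimal sketch of how such settings could be turned into an Airflow connection, under several assumptions: the variable names `DATABRICKS_HOST`, `DATABRICKS_CLIENT_ID` and `DATABRICKS_CLIENT_SECRET` are the ones I recall from the OAuth M2M guide linked earlier, the `service_principal_oauth` extra is what I understand the `apache-airflow-providers-databricks` connection to accept in recent versions, and `airflow_conn_json` is a hypothetical helper. Verify each against the provider version installed in your MWAA environment.

```python
import json
import os


def airflow_conn_json(env=os.environ) -> str:
    """Return a JSON-serialized Airflow connection for a service principal.

    Assumptions (not confirmed in this thread):
    - env holds DATABRICKS_HOST, DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET
    - the Databricks provider reads the client ID from "login", the secret
      from "password", and enables OAuth M2M via the
      "service_principal_oauth" extra
    """
    return json.dumps(
        {
            "conn_type": "databricks",
            "host": env["DATABRICKS_HOST"],
            "login": env["DATABRICKS_CLIENT_ID"],
            "password": env["DATABRICKS_CLIENT_SECRET"],
            "extra": {"service_principal_oauth": True},
        }
    )
```

On Airflow 2.3 or later, a JSON string like this can be stored as the value of an `AIRFLOW_CONN_<CONN_ID>` environment variable or as a secret in the configured secrets backend. The same fields can also be entered by hand in the Airflow UI under Admin > Connections.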

T_I
New Contributor II

Where should I specify those settings?