
Use a Service Principal Token instead of Personal Access Token for Databricks Asset Bundle

PabloCSD
Contributor III

How can I connect using a service principal token? I did this, but the token I have is not a PAT:

 

databricks configure
Databricks host: https:// ...
Personal access token: ****

 

I also tried this, but it didn't work either:

 

[profile]
host = <workspace-url>
client_id = <service-principal-client-id>
client_secret = <service-principal-secret>

 

I also tried this way (just in case), but nothing:

 

databricks configure --aad-token

 

How can I configure my Databricks workspace so that I can deploy a DAB to it using the service principal token (to avoid relying on PATs)?

 

Best regards, and thank you

#DAB #DatabricksAssetsBundle #ServicePrincipal

1 ACCEPTED SOLUTION

PabloCSD
Contributor III

Thanks Pedro, we got it working. For anyone who finds this in the future (I have replaced the host and service principal IDs with fake values):

1. Modify your databricks.yml so that it has the service principal ID and the Databricks workspace host:

 

bundle:
  name: my_workflow

# Declare to Databricks Asset Bundles that this is a Python project
# (this is the interaction with the "pyproject.toml" file)
artifacts:
  default:
    type: whl
    build: poetry build
    path: .

resources:
  jobs:
    my_workflow:
      name: my_workflow
      job_clusters:
        - job_cluster_key: ${bundle.target}-${bundle.name}-job-cluster
          new_cluster:
            num_workers: 2
            spark_version: "15.3.x-cpu-ml-scala2.12"
            node_type_id: Standard_DS3_v2
      tasks:
        - task_key: my_workflow_pipeline_task
          job_cluster_key: ${bundle.target}-${bundle.name}-job-cluster
          python_wheel_task:
            package_name: my_workflow
            entry_point: my_workflow_pipeline_task
          libraries:
            - whl: ./dist/*.whl
      permissions:
        # If you are using a group, you need to create it in the Databricks workspace first
        - group_name: "my_group_name"
          level: "CAN_MANAGE"

targets:
  dev:
    mode: development
    default: true
    workspace:
      # Put the associated workspace URL here
      host: https://adb-0000000000000000.7.azuredatabricks.net
    run_as:
      # Put the associated service principal's application ID here
      service_principal_name: 76w4hdge-39a2-0303-45c7-udnr93kvp03f
    resources:
      jobs:
        my_workflow:
          job_clusters:
            - job_cluster_key: ${bundle.target}-${bundle.name}-job-cluster
              new_cluster:
                num_workers: 2
                spark_version: "15.3.x-cpu-ml-scala2.12"
                node_type_id: Standard_DS3_v2
          permissions:
            # If you are using a group, you need to create it in the Databricks workspace first
            - group_name: "my_group_name"
              level: "CAN_MANAGE"

 

2. Create a .databrickscfg file in your home directory (the CLI looks for it at ~/.databrickscfg, as the error message below shows) with the following information:

 

[my_workflow]
host = https://adb-0000000000000000.7.azuredatabricks.net/
client_id = 76w4hdge-39a2-0303-45c7-udnr93kvp03f
client_secret = tomatoes***************spinach
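
To sanity-check the profile before deploying, you can optionally run something like this (a sketch; current-user me is available in recent versions of the unified Databricks CLI, and should print the service principal's identity if the OAuth credentials work):

# Should print the service principal's identity, not a human user
databricks current-user me --profile my_workflow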

 

3. In the terminal just run:

 

databricks bundle deploy --profile my_workflow

 

If everything was done correctly, this should be the output:

 

(.venv) oishiiramen@3301 my_directory % databricks bundle deploy --profile my_workflow
Building default...
Uploading my_workflow-0.1.1-py3-none-any.whl...
Uploading bundle files to /Users/76w4hdge-39a2-0303-45c7-udnr93kvp03f/.bundle/my_workflow/dev/files...
Deploying resources...
Updating deployment state...
Deployment complete!

 

If the .databrickscfg file was not created, this error can appear:

 

(.venv) oishiiramen@3301 my_directory % databricks bundle deploy --profile my_workflow
Error: cannot resolve bundle auth configuration: cannot parse config file: open /Users/oishiiramen/.databrickscfg: no such file or directory
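
Once the deployment succeeds, you can also trigger the deployed job under the same profile (a sketch, assuming a CLI version that supports bundle run; my_workflow here is the job resource key from the databricks.yml above):

databricks bundle run my_workflow --profile my_workflow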

 

 


3 REPLIES

dataeng42io
New Contributor III

Hi @PabloCSD,

Long story short, you can watch this video, where I go step by step through how to set up a service principal in Azure, grant it permissions on the workspace, and have it generate a token for itself via machine-to-machine authentication in the Databricks CLI.

These are the steps you need to take to deploy your bundle using a service principal:

1. Add the service principal to your Databricks account.

2. Give that service principal admin rights on the workspace where you want to deploy the DAB.

3. Generate a PAT (personal access token) for the service principal, which you can do in two ways:

3a. Via machine-to-machine authentication, where the service principal generates a PAT for itself. I demonstrate this in the video.

3b. Or generate a token for the service principal using the "on behalf of" option, provided the principal generating the token has at least workspace admin rights. On this post there is a solution for this option. Both options are sketched just below.
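
For illustration, here is roughly what the two options can look like. This is only a sketch, not something from the video or this thread: it assumes the unified Databricks CLI's tokens command group for 3a and the Token Management REST API for 3b, with placeholder names and lifetimes.

# 3a: while authenticated AS the service principal (OAuth M2M profile),
# it creates a PAT for itself
databricks tokens create --comment "dab-deploy" --lifetime-seconds 86400 --profile <sp-profile>

# 3b: as a workspace admin, create an on-behalf-of token for the service principal
curl -X POST "https://<workspace-url>/api/2.0/token-management/on-behalf-of/tokens" \
  -H "Authorization: Bearer <admin-token>" \
  -d '{"application_id": "<sp-application-id>", "comment": "dab-deploy", "lifetime_seconds": 86400}'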

 

To deploy your bundle using the CLI, you will use the command:

databricks bundle deploy -t <target-name> -p <sp-profile>

The service principal profile needs to have your service principal configured in your ~/.databrickscfg file, either with machine-to-machine authentication (OAuth token) or with a PAT (personal access token), for example as sketched below.
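
As an illustration (a sketch with placeholder values; the profile name is an assumption), the two profile shapes can look like this:

# OAuth machine-to-machine
[sp-profile]
host = https://<workspace-url>
client_id = <sp-application-id>
client_secret = <sp-oauth-secret>

# Or, alternatively, with a PAT
[sp-profile]
host = https://<workspace-url>
token = <pat-for-the-service-principal>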

Hope this helps.

Let me know if this solves your issue. If you have any other questions, I am here to help.

Regards

Pedro

