
Use a Service Principal Token instead of Personal Access Token for Databricks Asset Bundle

PabloCSD
Contributor

How can I connect using a service principal token? I tried this, but the prompt asks for a PAT, which is not what a service principal has:

 

databricks configure
Databricks host: https:// ...
Personal access token: ****

 

I also tried this profile, but it didn't work either:

 

[profile]
host = <workspace-url>
client_id = <service-principal-client-id>
client_secret = <service-principal-secret>
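For reference, a quick way to smoke-test a profile like this (assuming a recent Databricks CLI) is:

# Should print the service principal's identity if auth works
databricks current-user me --profile <profile>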

 

I also tried this way (just in case), but nothing worked:

 

databricks configure --aad-token

 

How can I configure my Databricks workspace so I can deploy a DAB to it using a service principal token (to avoid relying on PATs)?

 

Best regards, and thank you

#DAB #DatabricksAssetsBundle #ServicePrincipal

ACCEPTED SOLUTION

PabloCSD
Contributor

Thanks Pedro, we got it working. For anyone finding this in the future (I replaced the host and service principal IDs with fake values):

1. Modify your databricks.yml so it contains the service principal ID and the Databricks host:

 

bundle:
  name: my_workflow

# Declare to Databricks Asset Bundles that this is a Python project
# (this is the link to the "pyproject.toml" file)
artifacts:
  default:
    type: whl
    build: poetry build
    path: .

resources:
  jobs:
    my_workflow:
      name: my_workflow
      job_clusters:
        - job_cluster_key: ${bundle.target}-${bundle.name}-job-cluster
          new_cluster:
            num_workers: 2
            spark_version: "15.3.x-cpu-ml-scala2.12"
            node_type_id: Standard_DS3_v2
      tasks:
        - task_key: my_workflow_pipeline_task
          job_cluster_key: ${bundle.target}-${bundle.name}-job-cluster
          python_wheel_task:
            package_name: my_workflow
            entry_point: my_workflow_pipeline_task
          libraries:
            - whl: ./dist/*.whl
      permissions:
        # If you are using a group, it must already exist in the Databricks workspace
        - group_name: "my_group_name"
          level: "CAN_MANAGE"

targets:
  dev:
    mode: development
    default: true
    workspace:
      # Put the associated workspace URL here
      host: https://adb-0000000000000000.7.azuredatabricks.net
    run_as:
      # Put the service principal's application ID here
      service_principal_name: 76w4hdge-39a2-0303-45c7-udnr93kvp03f
    resources:
      jobs:
        my_workflow:
          job_clusters:
            - job_cluster_key: ${bundle.target}-${bundle.name}-job-cluster
              new_cluster:
                num_workers: 2
                spark_version: "15.3.x-cpu-ml-scala2.12"
                node_type_id: Standard_DS3_v2
          permissions:
            # If you are using a group, it must already exist in the Databricks workspace
            - group_name: "my_group_name"
              level: "CAN_MANAGE"

 

2. Create a .databrickscfg file in your home directory (the CLI looks for ~/.databrickscfg by default), with the following content:

 

[my_workflow]
host = https://adb-0000000000000000.7.azuredatabricks.net/
client_id = 76w4hdge-39a2-0303-45c7-udnr93kvp03f
client_secret = tomatoes***************spinach
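As a side note (we did not need this, but it should also work): the CLI's unified authentication can read the same credentials from environment variables instead of a .databrickscfg file:

# Same fake values as above; export before running the CLI
export DATABRICKS_HOST="https://adb-0000000000000000.7.azuredatabricks.net"
export DATABRICKS_CLIENT_ID="76w4hdge-39a2-0303-45c7-udnr93kvp03f"
export DATABRICKS_CLIENT_SECRET="tomatoes***************spinach"
databricks bundle deploy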

 

3. In the terminal just run:

 

databricks bundle deploy --profile my_workflow

 

If everything was done correctly, this should be the output:

 

(.venv) oishiiramen@3301 my_directory % databricks bundle deploy --profile my_workflow
Building default...
Uploading my_workflow-0.1.1-py3-none-any.whl...
Uploading bundle files to /Users/76w4hdge-39a2-0303-45c7-udnr93kvp03f/.bundle/my_workflow/dev/files...
Deploying resources...
Updating deployment state...
Deployment complete!
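As an optional extra step, the deployed job can then be triggered with the same profile (my_workflow here is the job resource key from the databricks.yml above):

# Runs the deployed job as the service principal
databricks bundle run my_workflow --profile my_workflow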

 

If the .databrickscfg file was not created, this error could appear instead:

 

(.venv) oishiiramen@3301 my_directory % databricks bundle deploy --profile my_workflow
Error: cannot resolve bundle auth configuration: cannot parse config file: open /Users/oishiiramen/.databrickscfg: no such file or directory

 

 


3 REPLIES

dataeng42io
New Contributor III

Hi @PabloCSD,

Long story short, you can watch this video where I go step by step through setting up a service principal in Azure, granting it permissions on the workspace, and having it generate a token for itself via machine-to-machine authentication in the Databricks CLI.

These are the steps you need to take to deploy your bundle using a service principal:

1. Add the service principal to your Databricks account.

2. Give that service principal admin rights on the workspace where you want to deploy the DAB.

3. Generate a PAT (personal access token) for the service principal, which you can do in two ways:

3.a Via machine-to-machine authentication, where the service principal generates a PAT for itself. I demonstrate this in the video.

3.b Or generate a token for the SP using the "on behalf of" option, provided the principal generating the token has at least workspace admin rights. There is a solution for this option in this post; see the sketch right after this list.
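As a rough sketch of option 3.b (the IDs below are the fake ones from this thread, and exact fields may vary by API version), the on-behalf-of token can be created through the Token Management REST API while authenticated as a workspace admin:

# $ADMIN_TOKEN is a token of a principal with workspace admin rights
curl -X POST \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  https://adb-0000000000000000.7.azuredatabricks.net/api/2.0/token-management/on-behalf-of/tokens \
  -d '{
    "application_id": "76w4hdge-39a2-0303-45c7-udnr93kvp03f",
    "comment": "DAB deploy token",
    "lifetime_seconds": 3600
  }'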

 

To deploy your bundle using the CLI, you will use this command:

databricks bundle deploy -t <target-name> -p <sp-profile>

The service principal profile needs to be configured in your ~/.databrickscfg file, either with machine-to-machine (OAuth) credentials or with a PAT (personal access token).
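For illustration, these are the two shapes that profile can take (the profile names are just placeholders):

# OAuth machine-to-machine
[sp-oauth]
host = https://adb-0000000000000000.7.azuredatabricks.net
client_id = <service-principal-application-id>
client_secret = <service-principal-oauth-secret>

# PAT generated for the service principal
[sp-pat]
host = https://adb-0000000000000000.7.azuredatabricks.net
token = <service-principal-pat>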

Hope this helps.

Let me know if this solves your issue. If you have any other questions, I am here to help.

Regards

Pedro

