Unable to upload a wheel file in Azure DevOps pipeline

vvk
New Contributor II

Hi, I am trying to upload a wheel file to a Databricks workspace from an Azure DevOps release pipeline so that it can be used on an interactive cluster. I tried the "databricks workspace import" command, but it does not appear to support .whl files, so I then tried uploading the wheel file to a Unity Catalog volume with the "databricks fs cp" command.

This works in my local CLI setup but fails in the DevOps pipeline with the authorization error "Authorization failed. Your token may be expired or lack the valid scope". In both the pipeline and the local CLI I am using the access token of a service principal (SP) that has full access to the catalog, yet only the pipeline fails. Any ideas would be greatly appreciated.
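
For reference, the command that works locally is roughly of this shape; the wheel file name, catalog name, and volume path are placeholders:

      # works from the local CLI; <catalog_name> and the wheel file name are placeholders
      databricks fs cp ./dist/my_package-0.1.0-py3-none-any.whl dbfs:/Volumes/<catalog_name>/bronze/libraries/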

2 REPLIES

Kaniz
Community Manager

Hi @vvk, uploading Python wheel files to an Azure Databricks workspace via an Azure DevOps release pipeline involves a few steps.

Let’s troubleshoot the issue you’re facing:

  1. Authorization Error: The “Authorization failed” error you’re encountering could be due to several reasons. Let’s explore some potential solutions:

    • Token Scope: Ensure that the service principal (SP) token you’re using has the scope required to interact with the Databricks workspace, including permission to access the catalog and to write files to the volume.

    • Token Expiry: Verify that the SP token has not expired. Tokens typically have a limited lifespan, so confirm it is still valid.

    • Workspace Access Control: Confirm that the SP has access to the specific volume you’re uploading the wheel file to. If the pipeline and the local CLI end up authenticating as different principals, their effective permissions can differ.

  2. Azure DevOps Pipeline Configuration: Let’s review your Azure DevOps pipeline configuration:

    • Service Connection: Ensure that the service connection in your Azure DevOps pipeline is correctly set up. It should use the same SP credentials that work in your local CLI.

    • Pipeline Variables: Check that pipeline variables such as the Databricks host and the SP token are configured correctly and actually reach the task that runs the CLI. Azure DevOps does not expose secret variables as environment variables unless you map them explicitly (see the pipeline step sketch after this list).

    • Pipeline Agent: Verify that the pipeline agent running the job has network access to the Databricks workspace. If it’s a self-hosted agent, ensure it’s properly configured.

  3. Databricks CLI Commands: You mentioned using the databricks fs cp command. For a Unity Catalog volume, the destination should be a dbfs:/Volumes/... path rather than a DBFS mount path, for example:

      databricks fs cp local-path-to-wheel.whl dbfs:/Volumes/<catalog_name>/<schema_name>/<volume_name>/

      Replace local-path-to-wheel.whl with the actual path to your wheel file, and adjust the catalog, schema, and volume names to match where you want the wheel to live.

  4. CI/CD Workflow: Consider following the recommended CI/CD workflow for Databricks with Azure DevOps:

    • Set up a Git repository.
    • Develop and test artifacts locally.
    • Push changes to the repository.
    • Configure Azure DevOps to automatically trigger builds and deployments based on repository events (e.g., pull requests).
  5. Alternative Approach: Instead of using the Databricks CLI, consider calling the Databricks REST API directly from your Azure DevOps pipeline. This gives you more flexibility and control over the deployment process (a sketch follows below).
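
Following up on point 2, here is a minimal sketch of what the script step of the pipeline could look like, assuming the unified Databricks CLI is installed on the agent. The host URL, variable names, wheel path, and volume path are placeholders to adapt to your setup:

      # Sketch of the script body for an Azure DevOps Bash task (placeholder names throughout).
      # Assumes the SP token is mapped into SP_ACCESS_TOKEN through the task's env: section,
      # because Azure DevOps does not expose secret variables as environment variables by default.
      export DATABRICKS_HOST="https://adb-<workspace-id>.azuredatabricks.net"
      export DATABRICKS_TOKEN="$SP_ACCESS_TOKEN"

      # Copy the wheel into the Unity Catalog volume.
      databricks fs cp dist/my_package-0.1.0-py3-none-any.whl \
        dbfs:/Volumes/<catalog_name>/<schema_name>/<volume_name>/

Printing the CLI version on the agent (e.g. databricks --version) and comparing it with your local setup is also a quick way to rule out a client-side difference.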
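
For point 5, here is a hedged sketch of a direct REST upload into a Unity Catalog volume. It assumes the workspace exposes the Files API under /api/2.0/fs/files (check the REST API reference for your workspace); host, token, and path values are placeholders:

      # Upload the wheel straight into the volume with a single PUT of the binary contents.
      curl -X PUT \
        -H "Authorization: Bearer $DATABRICKS_TOKEN" \
        -H "Content-Type: application/octet-stream" \
        --data-binary @dist/my_package-0.1.0-py3-none-any.whl \
        "$DATABRICKS_HOST/api/2.0/fs/files/Volumes/<catalog_name>/<schema_name>/<volume_name>/my_package-0.1.0-py3-none-any.whl"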

If the issue persists, check the logs in your DevOps pipeline for more detailed error messages.

Feel free to provide additional details, and we can continue troubleshooting! 🚀

For more information, you can refer to the official documentation on CI/CD with Azure Databricks.

 

vvk
New Contributor II

Hi,

I don't think there is any other issue with the pipeline setup, as I am able to perform other actions successfully (e.g. importing notebooks using databricks workspace import_dir). Only fs cp to the volume throws the authorization error. I double-checked and can confirm that the service principal has full access to the catalog where the volume resides. Debug output shows that the following API call returns an HTTP 403 error:

 


GET /api/2.0/dbfs/get-status?path=dbfs:/Volumes/<catalog_name>/bronze/libraries
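
For reference, the same call can be replayed locally with the SP token to see whether the 403 also occurs outside the pipeline (the host and catalog name below are placeholders):

      # Replays the request the CLI issues; a 403 here would point at the token/permissions
      # rather than at the DevOps agent or pipeline configuration.
      curl -H "Authorization: Bearer $SP_ACCESS_TOKEN" \
        "https://adb-<workspace-id>.azuredatabricks.net/api/2.0/dbfs/get-status?path=dbfs:/Volumes/<catalog_name>/bronze/libraries"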