Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

WorkspaceClient authentication fails when running on a Docker cluster

JohanS
New Contributor III

from databricks.sdk import WorkspaceClient
w = WorkspaceClient()

ValueError: default auth: cannot configure default credentials ...

I'm trying to instantiate a WorkspaceClient in a notebook on a cluster running a Docker image, but authentication fails.
The Docker image is based on databricksruntime/python:13.3-LTS, and databricks-sdk 0.24.0 is installed during build.

Running the same code on a non-Docker cluster works after updating databricks-sdk.
What else can I do to make this work?

1 ACCEPTED SOLUTION

Accepted Solutions

Srihasa_Akepati
New Contributor III

@JohanS As discussed, default auth from a notebook using the SDK on DCS (Databricks Container Services) has not yet been tested by Engineering. Please use PAT auth for now. I will keep you posted on the progress of default auth on DCS.
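A minimal sketch of the suggested PAT workaround, with placeholder values you would substitute for your own workspace URL and token (on a real cluster you would typically pull the token from a secret scope rather than hard-coding it):

```python
import os

# Placeholder values -- substitute your real workspace URL and a
# personal access token (User Settings > Developer > Access tokens).
os.environ["DATABRICKS_HOST"] = "https://<your-workspace>.cloud.databricks.com"
os.environ["DATABRICKS_TOKEN"] = "<personal-access-token>"

# With both variables set, the SDK's default credential chain resolves
# PAT auth instead of failing inside the Docker (DCS) container:
#   from databricks.sdk import WorkspaceClient
#   w = WorkspaceClient()
```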

View solution in original post

2 REPLIES 2

Kaniz
Community Manager
Hi @JohanS, it seems you’re encountering an authentication issue when trying to instantiate a WorkspaceClient on a cluster running a Docker image.
 
Let’s troubleshoot this! 😊

The error message you’re seeing, “default auth: cannot configure default credentials,” typically occurs when the Databricks SDK is unable to find valid credentials for authentication.
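Before working through the steps below, a quick pure-Python check (no Databricks calls involved) can show which of the environment variables the default credential chain looks for are actually present inside the container:

```python
import os

# Variables the SDK's default auth chain commonly checks.
AUTH_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN", "DATABRICKS_CONFIG_FILE")

def auth_env_report(environ=os.environ):
    """Return which auth-related variables are present in the environment."""
    return {var: var in environ for var in AUTH_VARS}

for var, present in auth_env_report().items():
    print(f"{var}: {'set' if present else 'NOT set'}")
```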

Here are some steps you can take to resolve this:

  1. Check Your Environment Variables:

    • Ensure that the environment variables DATABRICKS_HOST and DATABRICKS_TOKEN are correctly set. The Databricks SDK’s default credential chain checks these variables for authentication.
    • Verify that the token you’re using is valid and has the necessary permissions. Personal access tokens are commonly used for authentication.
  2. Docker Image Considerations:

    • The issue occurs on a cluster running a custom Docker image (Databricks Container Services). The container environment may not expose the same credentials that a standard runtime provides.
    • Make sure any required environment variables are correctly propagated into the container. Double-check how you’re setting these variables and whether they are accessible from the notebook.
  3. Debugging the Environment:

    • Verify that DATABRICKS_HOST is correctly set and that a valid personal access token is available inside the container.
    • Consider adding debugging statements to print which relevant variables are set during execution. This can help identify any unexpected behaviour.
  4. Update Databricks SDK:

    • You mentioned that the same code works on a non-Docker cluster after updating the Databricks SDK. Update the SDK to the same version inside your Docker image as well.
    • Older SDK versions may have compatibility issues or bugs that are resolved in newer releases.
  5. Verify Token and Host:

    • Double-check that the token and host values in your configuration file (~/.databrickscfg) match the correct workspace.
    • Confirm that the token is correctly written in the configuration file and that it corresponds to the expected workspace.
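For reference, a minimal ~/.databrickscfg looks like the fragment below (placeholder values; substitute your own workspace URL and token):

```ini
[DEFAULT]
host  = https://<your-workspace>.cloud.databricks.com
token = <personal-access-token>
```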

Good luck, and I hope you get it working! 🚀

 
