Error: default auth: cannot configure default credentials, please check...

6502
New Contributor III

Hi all, I'm experiencing a rather strange error.

The problem happens inside a GitLab pipeline:

$ databricks current-user me

Error: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: host=https://XXXXXXXXX.databricks.com. Env: DATABRICKS_HOST

The curious fact is that the same command works fine within a local Docker image.

The token used is a personal access token, and it is correctly propagated into the pipeline.

Has anybody else run into the same problem? Any clue is welcome.


2 REPLIES

Kaniz
Community Manager

Hi @6502, it seems you're encountering an issue with Databricks authentication within a GitLab pipeline. The error message indicates that the default credentials cannot be configured.

Let's explore some possible solutions:

  1. Pipeline Environment:

    • Ensure that the environment variables in your GitLab pipeline are correctly set. Specifically, check whether DATABRICKS_HOST is properly configured (see the verification sketch after this list).
    • Verify that the token you're using is correctly propagated within the pipeline.
  2. Authentication Configuration:

    • Refer to the Databricks documentation for guidance on configuring credentials. Make sure you've followed the recommended steps.
    • If you're using a personal token, ensure that it has the necessary permissions to access Databricks resources.
  3. Dependencies and Workspace Creation:

  4. Personal Access Token:
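
A minimal shell sketch of such a check, reusing the variable names from this thread (hashing the token means the secret itself is never printed):

# Hypothetical verification step for a pipeline job; adapt the names to your setup.
echo "DATABRICKS_HOST=$DATABRICKS_HOST"
printf '%s' "$DATABRICKS_TOKEN" | md5sum   # fingerprint only, never the raw token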

6502
New Contributor III (Accepted Solution)

Hi Kaniz,

the problem is not in the Databricks CLI; it is caused by some interaction inside the GitLab pipeline.

According to the documentation here:

Databricks personal access token authentication | Databricks on AWS (at the bottom of the page),

setting the env vars DATABRICKS_HOST and DATABRICKS_TOKEN is just ONE possible option. Well, on the GitLab pipeline I was working on, it was the ONLY option.
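
To make the two options concrete, here is a minimal sketch (the host is the masked one from this thread; the token value is a placeholder, not a real one):

# Option A: a config-file profile, i.e. what `databricks configure` writes:
#   [DEFAULT]
#   host  = https://xxxxxxx.cloud.databricks.com
#   token = dapiXXXXXXXX                  # placeholder
# Option B: environment variables only (the option that worked in our pipeline):
export DATABRICKS_HOST="https://xxxxxxx.cloud.databricks.com"
export DATABRICKS_TOKEN="dapiXXXXXXXX"    # placeholder personal access token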

It was a hard debugging session that ended with a lucky attempt by my colleague. Really a random and lucky attempt.

The problem is hard to spot, so let me share my experience in the hope that others will find it useful and time-saving.

> My pipeline script initializes the databricks-cli by passing
> the personal token and by setting the DATABRICKS_HOST env var. 

$ if [ -z $TF_VAR_personal_token ]; then echo "Zero size TF_VAR_personal_token"; else echo $TF_VAR_personal_token | databricks configure $DEBUG; fi
12:06:19 INFO start pid=292 version=0.214.0 args="databricks, configure, --debug"
12:06:19 INFO Saving /root/.databrickscfg pid=292
12:06:19 INFO completed execution pid=292 exit_code=0

> And the procedure completes with no error.

$ echo "DATABRICKS_HOST=$DATABRICKS_HOST, DATABRICKS_PATH=$DATABRICKS_PATH, token md5 will follow"
DATABRICKS_HOST=https://xxxxxxx.cloud.databricks.com, DATABRICKS_PATH=/Volumes/main/default/datalake, token md5 will follow
$ echo $TF_VAR_personal_token | md5sum
2b7425a1a31e169cce7e149de25635f8 -
$ id
uid=0(root) gid=0(root) groups=0(root)

> The configuration file .databrickscfg is correctly written with the passed information. Everything is working as expected.
$ cat /root/.databrickscfg | sed s/$TF_VAR_personal_token/xxxxxxxxxxxxx/g
[DEFAULT]
host = https://xxxxxxx.cloud.databricks.com
token = xxxxxxxxxxxxx
$ databricks auth env | sed s/$TF_VAR_personal_token/xxxxxxxxxxxxx/g
{
"env": {
"DATABRICKS_AUTH_TYPE": "pat",
"DATABRICKS_CONFIG_PROFILE": "DEFAULT",
"DATABRICKS_HOST": "https://xxxxxxx.cloud.databricks.com",
"DATABRICKS_TOKEN": "xxxxxxxxxxxxx"
}}

> Accessing the workspace is not a problem. Everything works so far.

$ curl -v -v -v 'https://xxxxxxx.cloud.databricks.com'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying xxx.xxx.xxx.xxx...
* Connected to xxxxxxx.cloud.databricks.com (xxx.xxx.xxx.xxx) port 443 (#0)
* ALPN: offers h2,http/1.1
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [88 bytes data]
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [155 bytes data]
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
{ [15 bytes data]
* TLSv1.3 (IN), TLS handshake, Certificate (11):
{ [4014 bytes data]
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* TLSv1.3 (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* TLSv1.3 (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN: server accepted h2
* Server certificate:
* subject: C=US; ST=California; L=San Francisco; O=Databricks Inc.; CN=*.cloud.databricks.com
* start date: Jan 8 00:00:00 2024 GMT
* expire date: Jan 6 23:59:59 2025 GMT
* subjectAltName: host "xxxxxxx.cloud.databricks.com" matched cert's "*.cloud.databricks.com"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1
* SSL certificate verify ok.
} [5 bytes data]
* using HTTP/2
* h2h3 [:method: GET]
* h2h3 [:path: /]
* h2h3 [:scheme: https]
* h2h3 [:authority:xxxxxxx.cloud.databricks.com]
* h2h3 [user-agent: curl/7.88.1]
* h2h3 [accept: */*]
* Using Stream ID: 1 (easy handle 0x55fe0bbc1c80)
} [5 bytes data]
> GET / HTTP/2
> Host: xxxxxxx.cloud.databricks.com
> user-agent: curl/7.88.1
> accept: */*
>
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [230 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [230 bytes data]
* old SSL session ID is stale, removing
{ [5 bytes data]
< HTTP/2 303
< location: https://xxxxxxx.cloud.databricks.com/login.html
< vary: Accept-Encoding
< date: Wed, 21 Feb 2024 12:06:19 GMT
< server: databricks
<
{ [0 bytes data]
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
* Connection #0 to host xxxxxxx.cloud.databricks.com left intact

Yet there is no way to communicate correctly with the remote system:

$ databricks current-user me $DEBUG
12:06:20 INFO start pid=310 version=0.214.0 args="databricks, current-user, me, --debug"
Error: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: host=https://xxxxxxxcloud.databricks.com. Env: DATABRICKS_HOST
12:06:20 ERROR failed execution pid=310 exit_code=1 error="default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: host=https://xxxxxxxcloud.databricks.com. Env: DATABRICKS_HOST" 

Best of all, testing the whole procedure outside the GitLab pipeline works like a charm. It's crazy, isn't it?

Well, we solved the problem by setting DATABRICKS_TOKEN directly and by no longer initializing the Databricks CLI the way we did.
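
For reference, a sketch of the pipeline step we ended up with (names such as TF_VAR_personal_token are specific to our setup; DATABRICKS_HOST was already set as a pipeline variable):

# Instead of piping the token into `databricks configure`, export it directly.
if [ -z "$TF_VAR_personal_token" ]; then
  echo "Zero size TF_VAR_personal_token" >&2
  exit 1
fi
export DATABRICKS_TOKEN="$TF_VAR_personal_token"
databricks current-user me   # now authenticates via DATABRICKS_HOST + DATABRICKS_TOKEN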