02-20-2024 09:18 AM - edited 02-20-2024 09:19 AM
Hola all, I'm experiencing quite a strange error.
The problem happens inside a GitLab pipeline:
$ databricks current-user me
Error: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: host=https://XXXXXXXXX.databricks.com. Env: DATABRICKS_HOST
The curious fact is that the same command works fine within a local Docker image.
The token used is a personal access token, and it is correctly propagated into the pipeline.
Did anybody else run into the same problem? Any clue is welcome.
02-21-2024 05:53 AM
Hola Kaniz,
the problem is not with the Databricks CLI but with some interaction happening inside the GitLab pipeline.
According to the documentation here:
Databricks personal access token authentication | Databricks on AWS (at the bottom of the page)
setting the env vars DATABRICKS_HOST and DATABRICKS_TOKEN is just ONE possible option. Well, in the GitLab pipeline I was working on, it was the ONLY option.
It was a hard debugging session that ended with a lucky attempt by a colleague. Really a random & lucky attempt.
The problem is hard to spot, so let me share my experience with others in the hope they will find it useful and time-saving.
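For reference, a minimal sketch of that env-var option (the host below is a placeholder; the token comes from the same pipeline variable we already had). As explained further down, this is the approach that finally worked for us:
$ export DATABRICKS_HOST='https://xxxxxxx.cloud.databricks.com'   # placeholder workspace URL
$ export DATABRICKS_TOKEN="$TF_VAR_personal_token"                # the personal access token
$ databricks current-user me                                      # the CLI picks up both values from the environment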
> My pipeline script initializes the databricks-cli by passing
> the personal token and by setting the DATABRICKS_HOST env var.
$ if [ -z $TF_VAR_personal_token ]; then echo "Zero size TF_VAR_personal_token"; else echo $TF_VAR_personal_token | databricks configure $DEBUG; fi
12:06:19 INFO start pid=292 version=0.214.0 args="databricks, configure, --debug"
12:06:19 INFO Saving /root/.databrickscfg pid=292
12:06:19 INFO completed execution pid=292 exit_code=0
> And the procedure completes with no error.
$ echo "DATABRICKS_HOST=$DATABRICKS_HOST, DATABRICKS_PATH=$DATABRICKS_PATH, token md5 will follow"
DATABRICKS_HOST=https://xxxxxxx.cloud.databricks.com, DATABRICKS_PATH=/Volumes/main/default/datalake, token md5 will follow
$ echo $TF_VAR_personal_token | md5sum
2b7425a1a31e169cce7e149de25635f8 -
$ id
uid=0(root) gid=0(root) groups=0(root)
> The configuration file .databrickscfg is correctly written with the passed information. Everything works as expected.
$ cat /root/.databrickscfg | sed s/$TF_VAR_personal_token/xxxxxxxxxxxxx/g
[DEFAULT]
host = https://xxxxxxx.cloud.databricks.com
token = xxxxxxxxxxxxx
$ databricks auth env | sed s/$TF_VAR_personal_token/xxxxxxxxxxxxx/g
{
  "env": {
    "DATABRICKS_AUTH_TYPE": "pat",
    "DATABRICKS_CONFIG_PROFILE": "DEFAULT",
    "DATABRICKS_HOST": "https://xxxxxxx.cloud.databricks.com",
    "DATABRICKS_TOKEN": "xxxxxxxxxxxxx"
  }
}
> Accessing the workspace is not a problem. Everything works so far.
$ curl -v -v -v 'https://xxxxxxx.cloud.databricks.com'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying xxx.xxx.xxx.xxx...
* Connected to xxxxxxx.cloud.databricks.com (xxx.xxx.xxx.xxx) port 443 (#0)
* ALPN: offers h2,http/1.1
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [88 bytes data]
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [155 bytes data]
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
{ [15 bytes data]
* TLSv1.3 (IN), TLS handshake, Certificate (11):
{ [4014 bytes data]
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* TLSv1.3 (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* TLSv1.3 (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN: server accepted h2
* Server certificate:
* subject: C=US; ST=California; L=San Francisco; O=Databricks Inc.; CN=*.cloud.databricks.com
* start date: Jan 8 00:00:00 2024 GMT
* expire date: Jan 6 23:59:59 2025 GMT
* subjectAltName: host "xxxxxxx.cloud.databricks.com" matched cert's "*.cloud.databricks.com"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1
* SSL certificate verify ok.
} [5 bytes data]
* using HTTP/2
* h2h3 [:method: GET]
* h2h3 [:path: /]
* h2h3 [:scheme: https]
* h2h3 [:authority:xxxxxxx.cloud.databricks.com]
* h2h3 [user-agent: curl/7.88.1]
* h2h3 [accept: */*]
* Using Stream ID: 1 (easy handle 0x55fe0bbc1c80)
} [5 bytes data]
> GET / HTTP/2
> Host: xxxxxxx.cloud.databricks.com
> user-agent: curl/7.88.1
> accept: */*
>
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [230 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [230 bytes data]
* old SSL session ID is stale, removing
{ [5 bytes data]
< HTTP/2 303
< location: https://xxxxxxx.cloud.databricks.com/login.html
< vary: Accept-Encoding
< date: Wed, 21 Feb 2024 12:06:19 GMT
< server: databricks
<
{ [0 bytes data]
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
* Connection #0 to host xxxxxxx.cloud.databricks.com left intact
But there is no way to communicate correctly with the remote system.
$ databricks current-user me $DEBUG
12:06:20 INFO start pid=310 version=0.214.0 args="databricks, current-user, me, --debug"
Error: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: host=https://xxxxxxxcloud.databricks.com. Env: DATABRICKS_HOST
12:06:20 ERROR failed execution pid=310 exit_code=1 error="default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: host=https://xxxxxxxcloud.databricks.com. Env: DATABRICKS_HOST"
Best of all, testing the whole procedure outside the GitLab pipeline works like a charm. Crazy, isn't it?
Well, we solved the problem by setting the DATABRICKS_TOKEN env var directly and no longer initializing the Databricks CLI the way we did.
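In practice, the fix boiled down to dropping the "databricks configure" step and exporting the token directly (a minimal sketch, assuming the token is still delivered through the masked $TF_VAR_personal_token CI/CD variable, as in our setup):
# instead of: echo $TF_VAR_personal_token | databricks configure
$ export DATABRICKS_TOKEN="$TF_VAR_personal_token"   # DATABRICKS_HOST was already set as a pipeline variable
$ databricks current-user me                         # now succeeds inside the pipeline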