02-20-2024 09:18 AM - edited 02-20-2024 09:19 AM
Hi all, I'm running into a rather strange error.
It happens inside a GitLab pipeline:
$ databricks current-user me
Error: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: host=https://XXXXXXXXX.databricks.com. Env: DATABRICKS_HOST
The curious fact is that the same command works fine inside a local Docker image.
The token used is a personal access token, and it is correctly propagated into the pipeline.
Has anybody else run into the same problem? Any clue is welcome.
Accepted Solutions
02-21-2024 05:53 AM
Hi Kaniz,
the problem is not in the Databricks CLI itself; it is due to some interaction happening inside the GitLab pipeline.
According to the documentation here:
Databricks personal access token authentication | Databricks on AWS (at the bottom of the page)
setting the env vars DATABRICKS_HOST and DATABRICKS_TOKEN is just ONE possible option. Well, in the GitLab pipeline I was working on, it turned out to be the ONLY option that worked.
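For reference, the env-var option from that page boils down to exporting two variables before any CLI call; a minimal sketch (the masked host below is a placeholder for your workspace URL, and `TF_VAR_personal_token` is the CI variable used in our pipeline):

```shell
# Personal access token auth via environment variables only;
# no `databricks configure` step and no ~/.databrickscfg needed.
export DATABRICKS_HOST="https://xxxxxxx.cloud.databricks.com"  # placeholder workspace URL
export DATABRICKS_TOKEN="$TF_VAR_personal_token"               # PAT held in the CI variable
```

With these two variables set, the CLI's unified authentication should pick them up directly, so a subsequent `databricks current-user me` needs no prior configuration step.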
It was a hard debugging session that ended with a lucky attempt by my colleague. Really a random, lucky attempt.
The problem is hard to spot, so let me share my experience in the hope others will find it useful and time-saving.
> My pipeline script initializes the databricks-cli by passing
> the personal token and by setting the DATABRICKS_HOST env var.
$ if [ -z $TF_VAR_personal_token ]; then echo "Zero size TF_VAR_personal_token"; else echo $TF_VAR_personal_token | databricks configure $DEBUG; fi
12:06:19 INFO start pid=292 version=0.214.0 args="databricks, configure, --debug"
12:06:19 INFO Saving /root/.databrickscfg pid=292
12:06:19 INFO completed execution pid=292 exit_code=0
> And the procedure completes with no error.
$ echo "DATABRICKS_HOST=$DATABRICKS_HOST, DATABRICKS_PATH=$DATABRICKS_PATH, token md5 will follow"
DATABRICKS_HOST=https://xxxxxxx.cloud.databricks.com, DATABRICKS_PATH=/Volumes/main/default/datalake, token md5 will follow
$ echo $TF_VAR_personal_token | md5sum
2b7425a1a31e169cce7e149de25635f8 -
$ id
uid=0(root) gid=0(root) groups=0(root)
> The configuration file .databrickscfg is correctly written with the information passed. Everything works as expected.
$ cat /root/.databrickscfg | sed s/$TF_VAR_personal_token/xxxxxxxxxxxxx/g
[DEFAULT]
host = https://xxxxxxx.cloud.databricks.com
token = xxxxxxxxxxxxx
$ databricks auth env | sed s/$TF_VAR_personal_token/xxxxxxxxxxxxx/g
{
"env": {
"DATABRICKS_AUTH_TYPE": "pat",
"DATABRICKS_CONFIG_PROFILE": "DEFAULT",
"DATABRICKS_HOST": "https://xxxxxxx.cloud.databricks.com",
"DATABRICKS_TOKEN": "xxxxxxxxxxxxx"
}}
> Accessing the workspace is not a problem. Everything works so far.
$ curl -v -v -v 'https://xxxxxxx.cloud.databricks.com'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying xxx.xxx.xxx.xxx...
* Connected to xxxxxxx.cloud.databricks.com (xxx.xxx.xxx.xxx) port 443 (#0)
* ALPN: offers h2,http/1.1
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [88 bytes data]
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [155 bytes data]
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
{ [15 bytes data]
* TLSv1.3 (IN), TLS handshake, Certificate (11):
{ [4014 bytes data]
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
{ [264 bytes data]
* TLSv1.3 (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* TLSv1.3 (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN: server accepted h2
* Server certificate:
* subject: C=US; ST=California; L=San Francisco; O=Databricks Inc.; CN=*.cloud.databricks.com
* start date: Jan 8 00:00:00 2024 GMT
* expire date: Jan 6 23:59:59 2025 GMT
* subjectAltName: host "xxxxxxx.cloud.databricks.com" matched cert's "*.cloud.databricks.com"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1
* SSL certificate verify ok.
} [5 bytes data]
* using HTTP/2
* h2h3 [:method: GET]
* h2h3 [:path: /]
* h2h3 [:scheme: https]
* h2h3 [:authority:xxxxxxx.cloud.databricks.com]
* h2h3 [user-agent: curl/7.88.1]
* h2h3 [accept: */*]
* Using Stream ID: 1 (easy handle 0x55fe0bbc1c80)
} [5 bytes data]
> GET / HTTP/2
> Host: xxxxxxx.cloud.databricks.com
> user-agent: curl/7.88.1
> accept: */*
>
{ [5 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [230 bytes data]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
{ [230 bytes data]
* old SSL session ID is stale, removing
{ [5 bytes data]
< HTTP/2 303
< location: https://xxxxxxx.cloud.databricks.com/login.html
< vary: Accept-Encoding
< date: Wed, 21 Feb 2024 12:06:19 GMT
< server: databricks
<
{ [0 bytes data]
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
* Connection #0 to host xxxxxxx.cloud.databricks.com left intact
And yet the CLI cannot communicate correctly with the remote system.
$ databricks current-user me $DEBUG
12:06:20 INFO start pid=310 version=0.214.0 args="databricks, current-user, me, --debug"
Error: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: host=https://xxxxxxxcloud.databricks.com. Env: DATABRICKS_HOST
12:06:20 ERROR failed execution pid=310 exit_code=1 error="default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method. Config: host=https://xxxxxxxcloud.databricks.com. Env: DATABRICKS_HOST"
Best of all, testing the whole procedure outside the GitLab pipeline works like a charm. Crazy, isn't it?
Well, we solved the problem by setting DATABRICKS_TOKEN as an environment variable and no longer initializing the Databricks CLI the way we did.

