02-11-2026 03:53 AM - edited 02-11-2026 03:56 AM
I have set up an asset bundle that deploys a job, with a job cluster in a pool, utilising a Docker image.
The Docker image pulls and the dependencies install, which was a battle in itself, but now I'm struggling to get a Databricks WorkspaceClient to authenticate in the actual job code.
The code worked seamlessly on Databricks serverless compute, but it fails inside the Docker container.
What is the recommended way to pass credentials through to a Docker container?
A couple of ways I can think of, which are not secure at all:
- pass credentials through as job parameters and include in instantiation of the WorkspaceClient
- embed the credentials in the Docker image at build time
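For illustration, the first of those (credentials as job parameters) would look something like the sketch below; parameter names are made up. The problem is that job parameter values are visible in plain text in the job run UI and API, so the token leaks:

```python
import argparse


def parse_credentials(argv=None):
    """Parse workspace credentials passed in as job parameters.

    Insecure: anything passed this way shows up in the job run
    details, which is exactly why I want to avoid it.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--databricks-host")
    parser.add_argument("--databricks-token")
    args = parser.parse_args(argv)
    return args.databricks_host, args.databricks_token


# These values would then be fed to WorkspaceClient(host=..., token=...).
host, token = parse_credentials(
    ["--databricks-host", "https://example.cloud.databricks.com",
     "--databricks-token", "dapiXXXX"]
)
```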
Any advice very much appreciated, thanks community!
dbutils = WorkspaceClient().dbutils  # fails with the error below

The error stacktrace:
ValueError: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
File /databricks/python/lib/python3.12/site-packages/databricks/sdk/config.py:510, in Config.init_auth(self)
509 try:
--> 510 self._header_factory = self._credentials_strategy(self)
511 self.auth_type = self._credentials_strategy.auth_type()
File /databricks/python/lib/python3.12/site-packages/databricks/sdk/credentials_provider.py:969, in DefaultCredentials.__call__(self, cfg)
968 auth_flow_url = "https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication"
--> 969 raise ValueError(
970 f"cannot configure default credentials, please check {auth_flow_url} to configure credentials for your preferred authentication method."
971 )
ValueError: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method.
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
File /databricks/python/lib/python3.12/site-packages/databricks/sdk/config.py:186, in Config.__init__(self, credentials_provider, credentials_strategy, product, product_version, clock, **kwargs)
185 self._validate()
--> 186 self.init_auth()
187 self._init_product(product, product_version)
File /databricks/python/lib/python3.12/site-packages/databricks/sdk/config.py:515, in Config.init_auth(self)
514 except ValueError as e:
--> 515 raise ValueError(f"{self._credentials_strategy.auth_type()} auth: {e}") from e
ValueError: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method.
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
File ~/.ipykernel/2380/command--1-131273890:18
15 entry = [ep for ep in metadata.distribution("comet_job").entry_points if ep.name == "main"]
16 if entry:
17 # Load and execute the entrypoint, assumes no parameters
---> 18 entry[0].load()()
19 else:
20 import importlib
File /databricks/python/lib/python3.12/site-packages/importlib_metadata/__init__.py:210, in EntryPoint.load(self)
205 """Load the entry point from its definition. If only a module
206 is indicated by the value, return that module. Otherwise,
207 return the named object.
208 """
209 match = self.pattern.match(self.value)
--> 210 module = import_module(match.group('module'))
211 attrs = filter(None, (match.group('attr') or '').split('.'))
212 return functools.reduce(getattr, attrs, module)
File /usr/lib/python3.12/importlib/__init__.py:90, in import_module(name, package)
88 break
89 level += 1
---> 90 return _bootstrap._gcd_import(name[level:], package, level)
File <frozen importlib._bootstrap>:1387, in _gcd_import(name, package, level)
File <frozen importlib._bootstrap>:1360, in _find_and_load(name, import_)
File <frozen importlib._bootstrap>:1331, in _find_and_load_unlocked(name, import_)
File <frozen importlib._bootstrap>:935, in _load_unlocked(spec)
File <frozen importlib._bootstrap_external>:995, in exec_module(self, module)
File <frozen importlib._bootstrap>:488, in _call_with_frames_removed(f, *args, **kwds)
File /usr/local/lib/python3.12/dist-packages/comet_job/main.py:13
11 from comet.comet import Comet
12 from comet.config import CometConfig
---> 13 from comet_job.event import FinaliserMessage, JobStatus, publish_to_sqs
14 from comet_job.params import parse_args
15 from comet_job.spark import SparkCSVReader
File /usr/local/lib/python3.12/dist-packages/comet_job/event.py:8
5 from databricks.sdk import WorkspaceClient
6 from pydantic import BaseModel, Field
----> 8 dbutils = WorkspaceClient().dbutils
11 class JobStatus(str, Enum):
12 STARTED = "STARTED"
File /databricks/python/lib/python3.12/site-packages/databricks/sdk/__init__.py:174, in WorkspaceClient.__init__(self, host, account_id, username, password, client_id, client_secret, token, profile, config_file, azure_workspace_resource_id, azure_client_secret, azure_client_id, azure_tenant_id, azure_environment, auth_type, cluster_id, google_credentials, google_service_account, debug_truncate_bytes, debug_headers, product, product_version, credentials_strategy, credentials_provider, config)
144 def __init__(
145 self,
146 *,
(...)
171 config: Optional[client.Config] = None,
172 ):
173 if not config:
--> 174 config = client.Config(
175 host=host,
176 account_id=account_id,
177 username=username,
178 password=password,
179 client_id=client_id,
180 client_secret=client_secret,
181 token=token,
182 profile=profile,
183 config_file=config_file,
184 azure_workspace_resource_id=azure_workspace_resource_id,
185 azure_client_secret=azure_client_secret,
186 azure_client_id=azure_client_id,
187 azure_tenant_id=azure_tenant_id,
188 azure_environment=azure_environment,
189 auth_type=auth_type,
190 cluster_id=cluster_id,
191 google_credentials=google_credentials,
192 google_service_account=google_service_account,
193 credentials_strategy=credentials_strategy,
194 credentials_provider=credentials_provider,
195 debug_truncate_bytes=debug_truncate_bytes,
196 debug_headers=debug_headers,
197 product=product,
198 product_version=product_version,
199 )
200 self._config = config.copy()
201 self._dbutils = _make_dbutils(self._config)
File /databricks/python/lib/python3.12/site-packages/databricks/sdk/config.py:190, in Config.__init__(self, credentials_provider, credentials_strategy, product, product_version, clock, **kwargs)
188 except ValueError as e:
189 message = self.wrap_debug_info(str(e))
--> 190 raise ValueError(message) from e
ValueError: default auth: cannot configure default credentials, please check https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication to configure credentials for your preferred authentication method.
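Worth noting from the trace: the failure happens at import time, because `comet_job/event.py` builds the client at module level (`dbutils = WorkspaceClient().dbutils`). Deferring construction doesn't fix the auth itself, but a sketch like this at least moves the failure to first use instead of failing the module import:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_dbutils():
    # Import and construct lazily, so importing this module does not
    # require credentials; WorkspaceClient() still needs valid default
    # auth the first time this function is actually called.
    from databricks.sdk import WorkspaceClient
    return WorkspaceClient().dbutils
```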
Workload failed, see run output for details

Dockerfile:
Just using the base image as a POC:
```
FROM databricksruntime/python:17.3-LTS
```
Asset config:
```
bundle:
  name: my_bundle
  uuid: d69231e9-224d-4052-bd71-0673ada2bd23

include:
  - resources/*.yml
  - resources/*/*.yml

variables:
  # ...omitted for brevity...

targets:
  dev-local:
    variables:
      environment: dev
      aws_region: ap-southeast-2
      input_path: s3://XXXXXXXXXXXXXXXXXXXX
      output_path: s3://XXXXXXXXXXXXXXXXXXXX
    mode: development
    default: true
    permissions:
      - group_name: devs
        level: CAN_RUN
      - group_name: devs
        level: CAN_MANAGE
      - group_name: automations
        level: CAN_RUN
      - service_principal_name: ${var.terraform_ci_app_id}
        level: CAN_MANAGE
    resources:
      clusters:
        my_cluster:
          instance_pool_id: XXXXXXXXXXX-rosin6-pool-XXXXXXXX
          docker_image:
            url: "registry.gitlab.com/my/image:latest"
            basic_auth:
              username: "${var.gitlab_username}"
              password: "${var.gitlab_token}"
          runtime_engine: "PHOTON"
          is_single_node: true
          kind: CLASSIC_PREVIEW
          node_type_id: "m5d.large"
          driver_node_type_id: "m5d.large"
          spark_version: "17.3.x-scala2.13"
      jobs:
        my_job:
          name: my_job
          parameters:
            - name: job_id
              default: "undefined"
            - name: job_config
              default: "{}"
            - name: environment
              default: ${var.environment}
            - name: run_id
              default: "{{job.run_id}}"
          tasks:
            - task_key: main_task
              existing_cluster_id: ${resources.clusters.my_cluster.id}
              max_retries: 0 # No retries by default.
              disable_auto_optimization: true # Stop retries from being added automatically.
              # libraries:
              #   - whl: ../dist/*.whl
              python_wheel_task:
                package_name: my_job
                entry_point: main
```

02-17-2026 11:10 AM
Hello @fleetwoodmatt,

Thanks for the detailed context.

Recommended approach: authenticate with a cluster-scoped service principal, passing its credentials to the container via the cluster's Spark config rather than job parameters or credentials baked into the image.

Why serverless worked but Docker doesn't: serverless compute provides credentials for the SDK's default authentication automatically, whereas your custom container starts without them, so DefaultCredentials finds nothing to use. The spark_env_vars bridge is exactly what fills this gap: Databricks does apply those variables to the cluster's process environment even inside Docker, because they're set at cluster start, before your container's entry point runs.

P.S.: Ensure the cluster-scoped service principal has the necessary workspace permissions: CAN_USE on the compute, whatever permissions the job actually needs, and READ on the secret scope.
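A sketch of what that could look like on the cluster definition; the secret scope and key names here are placeholders, and the secrets must exist before the cluster starts:

```yaml
resources:
  clusters:
    my_cluster:
      # ...existing docker_image / pool config...
      spark_env_vars:
        DATABRICKS_HOST: "https://<your-workspace-url>"
        DATABRICKS_CLIENT_ID: "{{secrets/my_scope/sp_client_id}}"
        DATABRICKS_CLIENT_SECRET: "{{secrets/my_scope/sp_client_secret}}"
```

With those three variables set in the environment, `WorkspaceClient()` should resolve OAuth machine-to-machine credentials through the SDK's default authentication chain, with no code changes in the job.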