11-07-2025 08:31 AM
You are hitting an authentication failure when a Databricks cluster pulls a custom Docker image from Azure Container Registry (ACR), even though admin tokens and service principals with AcrPull permissions work in other contexts such as a local CLI login. This is a common pain point, because Databricks has specific requirements for accessing private Docker registries like ACR.
Typical Causes and Checks
- Service Principal Permissions: Make sure your service principal has the `AcrPull` role on the ACR registry itself. A role assignment on a different resource group or subscription does not apply to the registry.
- Authentication Methods:
  - Databricks typically supports only username/password authentication for Docker registries; Docker CLI tokens and ACR access keys cannot be used directly.
  - The username can be the service principal's application (client) ID, and the password must be the client secret created for that service principal, not an ACR-generated token.
- Cluster Configuration: In Databricks, the registry credentials must be set in the cluster configuration. They can be provided as secret references or directly, in the following format:
  ```json
  "docker_image": {
    "url": "<registry-name>.azurecr.io/<image>:<tag>",
    "basic_auth": {
      "username": "<service-principal-app-id>",
      "password": "<service-principal-secret>"
    }
  }
  ```
- Networking Issues:
  - Ensure your Databricks workspace can reach ACR: no NSG or firewall rule should block the registry's endpoint, especially if ACR is exposed through a private endpoint.
- Format of the Image URL: Reference the image by the registry's fully qualified login server: `<registry-name>.azurecr.io/<image>:<tag>`.
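If you generate cluster specs from a script, the checks above can be folded into a small helper. This is only a sketch: it assumes the Clusters API `docker_image` object with `url` and `basic_auth` fields and the `{{secrets/<scope>/<key>}}` reference syntax, and the function names, the regular expression, and the scope and key names are hypothetical.

```python
import json
import re

# Assumed shape of an ACR image reference: <registry>.azurecr.io/<repository>[:<tag>]
_ACR_IMAGE = re.compile(r"^[a-z0-9]+\.azurecr\.io/[a-z0-9._/-]+(:[\w.-]+)?$")


def secret_ref(scope: str, key: str) -> str:
    """Return a Databricks secret reference such as {{secrets/my-scope/my-key}}."""
    return "{{secrets/" + scope + "/" + key + "}}"


def docker_image_spec(image_url: str, scope: str, id_key: str, secret_key: str) -> dict:
    """Build the docker_image fragment of a cluster spec using secret references."""
    if not _ACR_IMAGE.match(image_url):
        raise ValueError(f"not a fully qualified ACR image URL: {image_url!r}")
    return {
        "docker_image": {
            "url": image_url,
            "basic_auth": {
                "username": secret_ref(scope, id_key),
                "password": secret_ref(scope, secret_key),
            },
        }
    }


if __name__ == "__main__":
    spec = docker_image_spec(
        "myregistry.azurecr.io/myimage:latest",
        "my-scope",
        "my-sp-client-id",
        "my-sp-client-secret",
    )
    print(json.dumps(spec, indent=2))
```

Rejecting a malformed image URL before the cluster is created tends to be easier to debug than a failed pull at startup.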
Steps to Diagnose and Resolve
- Verify Service Principal Credentials
  - Double-check the Application (client) ID and client secret, and confirm that they are configured correctly in the Databricks cluster spec.
- Assign Proper Role to Service Principal
  - In the Azure Portal, navigate to your Container Registry → Access Control (IAM) and assign the service principal the “AcrPull” role at the registry level.
- Configuration in Databricks
  - Place the credentials in a Databricks secret scope for security.
  - Add the snippet to the cluster's custom Docker settings, for example:
    ```json
    "docker_image": {
      "url": "myregistry.azurecr.io/myimage:latest",
      "basic_auth": {
        "username": "{{secrets/my-scope/my-sp-client-id}}",
        "password": "{{secrets/my-scope/my-sp-client-secret}}"
      }
    }
    ```
- Test Locally with Service Principal
  - Try logging in locally (outside Databricks) with the `docker login` command, using the service principal app ID as the username and the client secret as the password:
    ```text
    docker login myregistry.azurecr.io --username <sp-app-id> --password <sp-client-secret>
    ```
  - If this fails, the service principal is not configured correctly.
- Check Cluster Logs
  - Inspect the cluster event logs for Docker pull errors. Look for messages about authentication failures or unauthorized access.
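If you want to script the local login test, here is one possible sketch. It is a variation on the `docker login` command above that passes the client secret through `--password-stdin`, so the secret stays out of your shell history and process list. The function names are hypothetical, and it assumes the Docker CLI is installed locally.

```python
import shutil
import subprocess


def docker_login_args(registry: str, app_id: str) -> list[str]:
    """Argument list for `docker login`, with the password read from stdin."""
    return ["docker", "login", registry, "--username", app_id, "--password-stdin"]


def can_login(registry: str, app_id: str, client_secret: str) -> bool:
    """Return True if `docker login` succeeds with the service principal credentials."""
    if shutil.which("docker") is None:
        raise RuntimeError("docker CLI not found on PATH")
    result = subprocess.run(
        docker_login_args(registry, app_id),
        input=client_secret,  # sent on stdin, never on the command line
        text=True,
        capture_output=True,
    )
    if result.returncode != 0:
        # Surface the CLI's own error message to help with diagnosis.
        print(result.stderr.strip())
    return result.returncode == 0
```

A `False` result here points at the service principal or its secret rather than at Databricks.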
Key Pitfalls
- Using Admin User: Databricks does not support ACR admin user credentials for cluster Docker pulls; only service principal credentials are allowed.
- Token vs Secret: Do not use ACR tokens or admin passwords; only service principal credentials are valid here.
Next Steps
If the above steps do not resolve the problem:
- Double-check that the service principal's client secret has not expired.
- Make sure the image URL uses the registry's fully qualified login server (`<registry-name>.azurecr.io`), not just the registry name.
- Validate the cluster network configuration for outbound access to Azure services.
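For the network check, a rough sketch you could run from a notebook on a cluster in the same workspace network is shown below. It only tests whether a TCP connection to the registry on port 443 can be opened; it says nothing about authentication. The function name is hypothetical and `myregistry.azurecr.io` is a placeholder.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    print(can_reach("myregistry.azurecr.io"))
```

If this returns `False` from the cluster but `True` from your workstation, the problem is more likely an NSG, firewall, or private endpoint setting than the credentials.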