Hey DinoSaluzzi - Thanks for reaching out!
The error you're seeing (ValueError: default auth: runtime: default auth: cannot configure default credentials) reflects stricter enforcement of how authentication is handled on Spark worker nodes running pandas UDFs on Databricks. This is not an isolated issue but a byproduct of tightening security and compliance standards for accessing Databricks-hosted foundation model endpoints.
Why This Error Occurs
- When you invoke a Databricks foundation model API (for example through the ChatDatabricks wrapper) inside a pandas UDF, the code executes on the Spark worker nodes, not on the driver process.
- These worker nodes do not inherit your interactive (user) session credentials or SSO context by default. Any authentication that was previously available through implicit session passing, workspace defaults, or inherited environment variables is not reliably available to the workers.
- Your workflow may have worked in the past because of looser authentication checks or side effects in how worker environments were initialized. Recent updates to the authentication libraries and Databricks Runtime safety controls now require that every API call made from a worker node supplies explicit, valid credentials at call time.
- As a result, code that doesn't pass credentials explicitly to the API call inside each worker now fails with errors like the one you attached. The sketch below shows the pattern that triggers it.
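For illustration, here is a simplified sketch of that failing pattern. The import path and endpoint name are assumptions about your setup; the key point is that the wrapper is built inside the UDF with no explicit credentials, so worker-side default auth fails:

```python
import pandas as pd
from pyspark.sql.functions import pandas_udf

@pandas_udf("string")
def summarize(prompts: pd.Series) -> pd.Series:
    # This body runs on a worker: no user session, no SSO context, no default credentials.
    from databricks_langchain import ChatDatabricks  # import path assumed; adjust to your package
    llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")  # placeholder endpoint name
    # Without explicit credentials on the worker, this raises:
    # ValueError: default auth: cannot configure default credentials
    return prompts.apply(lambda p: llm.invoke(p).content)
```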
Recommended Approach: How to Properly Authenticate in Worker Nodes
To comply with Databricks' security best practices and ensure pandas UDFs (i.e., the worker nodes) can call foundation model endpoints:
- Never rely on implicit session credentials inside UDFs. Every worker runs as a clean process without your user/session context.
- Use a Machine-to-Machine (M2M) OAuth Token (Preferred for Production):
  - Register a service principal with appropriate permissions.
  - Store the OAuth credentials (client ID/secret) securely in a Databricks secret scope.
  - Obtain a fresh OAuth token for the service principal and make it available inside the workers, typically by reading the secrets on the driver so the values are captured in the UDF's closure (see the sketch below).
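Here is a rough sketch of that M2M flow. The secret scope and key names are placeholders for wherever you store the service principal's credentials; the client ID/secret are read on the driver, and a token is minted against the workspace OAuth token endpoint:

```python
import requests

host = "https://<your-workspace-url>"  # e.g. https://....cloud.databricks.com
client_id = dbutils.secrets.get(scope="my-secret-scope", key="SP_CLIENT_ID")          # placeholder key
client_secret = dbutils.secrets.get(scope="my-secret-scope", key="SP_CLIENT_SECRET")  # placeholder key

def get_oauth_token() -> str:
    # Databricks M2M OAuth: client_credentials grant against the workspace token endpoint.
    resp = requests.post(
        f"{host}/oidc/v1/token",
        auth=(client_id, client_secret),
        data={"grant_type": "client_credentials", "scope": "all-apis"},
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

token = get_oauth_token()
```

Keep in mind these OAuth tokens expire (typically after about an hour), so long-running jobs should refresh the token inside the UDF rather than relying on a single value captured on the driver.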
- Alternatively, Pass a Service Principal PAT (For Dev/Testing):
  - Create a PAT for a service principal (not a personal/user PAT).
  - Store it in a Databricks secret scope.
  - Retrieve the token from the secret scope on the driver and reference it inside your pandas UDF (or broadcast it to all workers):
# Read on the driver; the value is captured in the UDF closure and shipped to the workers.
token = dbutils.secrets.get(scope="my-secret-scope", key="FMAPI_TOKEN")

def my_udf(...):
    headers = {"Authorization": f"Bearer {token}"}
- Set Environment Variables or Use Spark Broadcasts (Optional):
  - If a library needs environment variables to detect auth, set them programmatically in each worker process inside your UDF, as shown below.
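For example, a minimal sketch; DATABRICKS_HOST and DATABRICKS_TOKEN are the variables the Databricks SDK and most wrappers look for, but check what your specific library expects:

```python
import os

def _configure_worker_auth(host: str, token: str) -> None:
    # Call this at the top of the UDF body so client libraries that rely on
    # "default" credentials can find them on the worker.
    os.environ["DATABRICKS_HOST"] = host
    os.environ["DATABRICKS_TOKEN"] = token
```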
- Never embed credentials in code or notebooks. Use Databricks secret scopes or environment configuration for sensitive information.
Here are some resources (switch clouds with the dropdown at the top right):
https://docs.databricks.com/aws/en/machine-learning/model-serving/score-foundation-models
https://docs.databricks.com/aws/en/machine-learning/model-serving/score-custom-model-endpoints
https://docs.databricks.com/aws/en/admin/users-groups/best-practices