
MLflow Authentication from a Databricks App for GenAI Tracing

pemidexx
New Contributor III

I am working on a Dash-based app that calls a Databricks-hosted LLM endpoint, and I am trying to trace those calls with MLflow. My code is (roughly) like this:

from openai import OpenAI
import mlflow

# Log to the Databricks-hosted MLflow tracking server
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/my-name@mycompany.com/my-experiment")
mlflow.openai.autolog()  # capture OpenAI calls as MLflow traces

client = OpenAI()

This works as expected when testing the app locally, but it results in an authentication error when deployed to Databricks Apps. Specifically, if I turn on debug logging, I see this error before the app crashes:

DEBUG:urllib3.connectionpool:https://[MY WORKSPACE].databricks.com:443 "GET /api/2.0/mlflow/experiments/get-by-name?experiment_name=%2FUsers%2Fmy-name%40mycompany.com%2Fmy-experiment HTTP/1.1" 401 144
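
For context, here is a rough sketch (nothing app-specific in it) that lists which DATABRICKS_* variables the App environment actually provides, with values truncated so nothing sensitive is printed:

import os

# List the Databricks-related variables the App environment injects,
# truncating values so nothing sensitive is printed.
for key, value in sorted(os.environ.items()):
    if key.startswith("DATABRICKS_"):
        print(key, "=", (value[:4] + "...") if value else "<empty>")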

I have attempted the following (unsuccessfully):

  1. Manually setting the tracking URI to the specific URI for my workspace.
  2. Setting the DATABRICKS_TOKEN environment variable to DATABRICKS_CLIENT_SECRET from the app's environment.
  3. Setting the DATABRICKS_TOKEN to the value of a PAT I generated for the app.

What is the correct way to capture GenAI traces with MLflow from within a Databricks App deployment?


4 REPLIES

samshifflett46
New Contributor III

Hey @pemidexx, this may be a dumb question, but have you set your DATABRICKS_HOST env variable?

os.environ["DATABRICKS_HOST"] = "https://dbc-1234567890123456.cloud.databricks.com" # set to your server URI
os.environ["DATABRICKS_TOKEN"] = "dapixxxxxxxxxxxxx"

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/your-experiment")

pemidexx
New Contributor III

Good question - unfortunately, yes. If I set DATABRICKS_TOKEN, I get a different error:

2025/05/01 13:27:40 DEBUG mlflow.utils.databricks_utils: Failed to create databricks SDK workspace client, error: ValueError('validate: more than one authorization method configured: oauth and pat. Config: host=https://myworkspace.cloud.databricks.com, REDACTED_SECRET client_id=MYCLIENTID, client_REDACTED_SECRET warehouse_id=MYWAREHOUSE. Env: DATABRICKS_HOST, DATABRICKS_TOKEN, DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET, DATABRICKS_WAREHOUSE_ID')

samshifflett46
New Contributor III

Ah, I've seen this issue many times. The Databricks SDK here is trying to authenticate with the Databricks API, but the environment variables are set for multiple types of authentication. If you remove the

DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET

environment variables, then I think you should get past that error. Let me know if there are any other issues after that fix is attempted!
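
As a quick sanity check (just a rough sketch, assuming the databricks-sdk package is available to the App), you can confirm the SDK authenticates cleanly once those variables are cleared:

import os
from databricks.sdk import WorkspaceClient

# Drop the OAuth client-credential variables so only the PAT remains
os.environ.pop("DATABRICKS_CLIENT_ID", None)
os.environ.pop("DATABRICKS_CLIENT_SECRET", None)

w = WorkspaceClient()  # should no longer raise the "more than one authorization method" error
print(w.current_user.me().user_name)  # confirms the PAT is being used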

pemidexx
New Contributor III

Thank you for the push in the right direction! I was able to solve the issue with this code:

os.environ["DATABRICKS_CLIENT_ID"] = ""
os.environ["DATABRICKS_CLIENT_SECRET"] = ""
os.environ["DATABRICKS_TOKEN"] = os.environ.get("VAR_CONFIGURED_WITH_DATABRICKS_SECRETS")

# Enables trace logging by default
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/my-name@mycompany.com/my-experiment")
mlflow.openai.autolog()

Note that I did not need to set DATABRICKS_HOST, as that's already set in the App's default environment.
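
For anyone landing here later, here is a rough end-to-end check (the endpoint name is just a placeholder; this assumes an OpenAI-compatible Databricks serving endpoint, a recent MLflow version, and that the snippet above has already run):

import os
import mlflow
from openai import OpenAI

host = os.environ["DATABRICKS_HOST"]
if not host.startswith("http"):
    host = "https://" + host  # the App may set the host without a scheme

# "my-endpoint" is a placeholder for an OpenAI-compatible serving endpoint
client = OpenAI(api_key=os.environ["DATABRICKS_TOKEN"],
                base_url=host.rstrip("/") + "/serving-endpoints")
client.chat.completions.create(model="my-endpoint",
                               messages=[{"role": "user", "content": "ping"}])

# The autologged trace should appear in the active experiment
print(mlflow.search_traces(max_results=1))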

Thanks @samshifflett46!
