04-29-2025 01:49 PM
I am working on a Dash-based app that calls a Databricks-hosted LLM endpoint, and I am trying to trace those calls with MLflow. My code looks (roughly) like this:
from openai import OpenAI
import mlflow

mlflow.set_tracking_uri("databricks")  # log to the Databricks-hosted tracking server
mlflow.set_experiment("/Users/my-name@mycompany.com/my-experiment")
mlflow.openai.autolog()  # automatically trace OpenAI client calls

client = OpenAI()
This works as expected when testing the app locally, but it results in an authentication error when deployed to Databricks Apps. Specifically, with debug logging turned on I see this error before the app crashes:
DEBUG:urllib3.connectionpool:https://[MY WORKSPACE].databricks.com:443 "GET /api/2.0/mlflow/experiments/get-by-name?experiment_name=%2FUsers%2Fmy-name%40mycompany.com%2Fmy-experiment HTTP/1.1" 401 144
I have attempted several workarounds, all unsuccessfully.
What is the correct way to capture GenAI traces with MLflow from within a Databricks Apps deployment?
04-29-2025 05:27 PM
Hey @pemidexx, this may be a dumb question, but have you set your DATABRICKS_HOST env variable?
os.environ["DATABRICKS_HOST"] = "https://dbc-1234567890123456.cloud.databricks.com" # set to your server URI
os.environ["DATABRICKS_TOKEN"] = "dapixxxxxxxxxxxxx"
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/your-experiment")
05-01-2025 06:31 AM
Good question. Unfortunately, yes: if I set DATABRICKS_TOKEN I get a different error:
2025/05/01 13:27:40 DEBUG mlflow.utils.databricks_utils: Failed to create databricks SDK workspace client, error: ValueError('validate: more than one authorization method configured: oauth and pat. Config: host=https://myworkspace.cloud.databricks.com, REDACTED_SECRET client_id=MYCLIENTID, client_REDACTED_SECRET warehouse_id=MYWAREHOUSE. Env: DATABRICKS_HOST, DATABRICKS_TOKEN, DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET, DATABRICKS_WAREHOUSE_ID')
05-01-2025 07:07 AM
Ah, I've seen this issue many times. The Databricks SDK here is trying to authenticate with the Databricks API, but the environment variables are set for multiple types of authentication at once (OAuth and PAT). If you remove the DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET environment variables, I think you should get past that error. Let me know if there are any other issues after that fix is attempted!
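Roughly, something like this early in the app, before any MLflow or SDK calls (just a sketch, adjust to your setup):

import os

# Drop the injected OAuth credentials so the SDK sees only one auth method
for var in ("DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET"):
    os.environ.pop(var, None)  # no error if the variable is absent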
05-01-2025 10:32 AM - edited 05-01-2025 10:33 AM
Thank you for the push in the right direction! I was able to solve the issue with this code:
os.environ["DATABRICKS_CLIENT_ID"] = ""
os.environ["DATABRICKS_CLIENT_SECRET"] = ""
os.environ["DATABRICKS_TOKEN"] = os.environ.get("VAR_CONFIGURED_WITH_DATABRICKS_SECRETS")
# Enables trace logging by default
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/my-name@mycompany.com/my-experiment")
mlflow.openai.autolog()
Note that I did not need to set DATABRICKS_HOST, as that's already set in the App's default environment.
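For anyone else hitting this: VAR_CONFIGURED_WITH_DATABRICKS_SECRETS above is a placeholder for an environment variable backed by a Databricks secret in the app's app.yaml. The wiring looks roughly like this (the resource name token-secret is just an example, not something specific from this thread):

command:
  - python
  - app.py
env:
  - name: VAR_CONFIGURED_WITH_DATABRICKS_SECRETS
    valueFrom: token-secret  # app resource bound to a Databricks secret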
Thanks @samshifflett46!