
MLflow Authentication from a Databricks App for GenAI Tracing

pemidexx
New Contributor III

I am working on a Dash-based app that calls a Databricks-hosted LLM endpoint, and I am trying to trace those calls with MLflow. My code is (roughly) like this:

from openai import OpenAI
import mlflow

# Log to the Databricks-hosted MLflow tracking server
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/my-name@mycompany.com/my-experiment")
mlflow.openai.autolog()  # capture OpenAI calls as MLflow traces

client = OpenAI()

This works as expected when testing the app locally, but it results in an authentication error when deployed to Databricks Apps. Specifically, if I turn on debug logging, I see this error before the app crashes:

DEBUG:urllib3.connectionpool:https://[MY WORKSPACE].databricks.com:443 "GET /api/2.0/mlflow/experiments/get-by-name?experiment_name=%2FUsers%2Fmy-name%40mycompany.com%2Fmy-experiment HTTP/1.1" 401 144
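
For context, here is a rough sketch (nothing app-specific in it) that lists which DATABRICKS_* variables the App environment actually provides, with values truncated so nothing sensitive is printed:

import os

# List the Databricks-related variables the App environment injects,
# truncating values so nothing sensitive is printed.
for key, value in sorted(os.environ.items()):
    if key.startswith("DATABRICKS_"):
        print(key, "=", (value[:4] + "...") if value else "<empty>")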

I have attempted the following (unsuccessfully):

  1. Manually setting the tracking URI to the specific URI for my workspace.
  2. Setting the DATABRICKS_TOKEN environment variable to DATABRICKS_CLIENT_SECRET from the app's environment.
  3. Setting the DATABRICKS_TOKEN to the value of a PAT I generated for the app.

What is the correct way to capture GenAI traces with MLflow from within a Databricks App deployment?


4 REPLIES

samshifflett46
New Contributor III

Hey @pemidexx, this may be a dumb question, but have you set your DATABRICKS_HOST env variable?

os.environ["DATABRICKS_HOST"] = "https://dbc-1234567890123456.cloud.databricks.com" # set to your server URI
os.environ["DATABRICKS_TOKEN"] = "dapixxxxxxxxxxxxx"

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/your-experiment")

pemidexx
New Contributor III

Good question - unfortunately, yes. If I set DATABRICKS_TOKEN, I get a different error:

2025/05/01 13:27:40 DEBUG mlflow.utils.databricks_utils: Failed to create databricks SDK workspace client, error: ValueError('validate: more than one authorization method configured: oauth and pat. Config: host=https://myworkspace.cloud.databricks.com, REDACTED_SECRET client_id=MYCLIENTID, client_REDACTED_SECRET warehouse_id=MYWAREHOUSE. Env: DATABRICKS_HOST, DATABRICKS_TOKEN, DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET, DATABRICKS_WAREHOUSE_ID')

samshifflett46
New Contributor III

Ah, I've seen this issue many times. The Databricks SDK here is trying to authenticate with the Databricks API, but the environment variables are set for multiple types of authentication. If you remove the

DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET

environment variables, then I think you should get past that error. Let me know if there are any other issues after that fix is attempted!
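
As a quick sanity check (just a rough sketch, assuming the databricks-sdk package is available to the App), you can confirm the SDK authenticates cleanly once those variables are cleared:

import os
from databricks.sdk import WorkspaceClient

# Drop the OAuth client-credential variables so only the PAT remains
os.environ.pop("DATABRICKS_CLIENT_ID", None)
os.environ.pop("DATABRICKS_CLIENT_SECRET", None)

w = WorkspaceClient()  # should no longer raise the "more than one authorization method" error
print(w.current_user.me().user_name)  # confirms the PAT is being used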

pemidexx
New Contributor III

Thank you for the push in the right direction! I was able to solve the issue with this code:

os.environ["DATABRICKS_CLIENT_ID"] = ""
os.environ["DATABRICKS_CLIENT_SECRET"] = ""
os.environ["DATABRICKS_TOKEN"] = os.environ.get("VAR_CONFIGURED_WITH_DATABRICKS_SECRETS")

# Enables trace logging by default
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/my-name@mycompany.com/my-experiment")
mlflow.openai.autolog()

Note that I did not need to set DATABRICKS_HOST, as that's already set in the App's default environment.
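
For anyone landing here later, here is a rough end-to-end check (the endpoint name is just a placeholder; this assumes an OpenAI-compatible Databricks serving endpoint, a recent MLflow version, and that the snippet above has already run):

import os
import mlflow
from openai import OpenAI

host = os.environ["DATABRICKS_HOST"]
if not host.startswith("http"):
    host = "https://" + host  # the App may set the host without a scheme

# "my-endpoint" is a placeholder for an OpenAI-compatible serving endpoint
client = OpenAI(api_key=os.environ["DATABRICKS_TOKEN"],
                base_url=host.rstrip("/") + "/serving-endpoints")
client.chat.completions.create(model="my-endpoint",
                               messages=[{"role": "user", "content": "ping"}])

# The autologged trace should appear in the active experiment
print(mlflow.search_traces(max_results=1))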

Thanks @samshifflett46!
