Workflow not picking up correct host value (while working with MLflow model registry URI)

Dharma25
New Contributor III

Exception: mlflow.exceptions.MlflowException: An API request to https://canada.cloud.databricks.com/api/2.0/mlflow/model-versions/list-artifacts failed due to a timeout. The error message was: HTTPSConnectionPool(host='canada.cloud.databricks.com', port=443): Max retries exceeded with url: /api/2.0/mlflow/model-versions/list-artifacts.

Expected: The request should be directed to the current workspace URI instead of cloud.databricks.com.

I am encountering a strange issue while executing my code within a workflow. I am attempting to load an MLflow model registered in the Unity Catalog using mlflow.pyfunc.spark_udf(). I have ensured that the model URI and other parameters are correct. This code is part of a Python class.

Additionally, I am setting the registry URI using mlflow.set_registry_uri("databricks-uc"). Below are the environment variables I am configuring at the compute level:

DATABRICKS_HOST="<workspace_url>"
DATABRICKS_TOKEN="<PAT>"

Spark configuration:

spark.executorEnv.DATABRICKS_HOST="<workspace_url>"
spark.executorEnv.DATABRICKS_TOKEN="<PAT>"

Databricks runtime version: 14.3

lingareddy_Alva
Esteemed Contributor

@Dharma25 

The error clearly shows that MLflow requests are being incorrectly routed to "canada.cloud.databricks.com" instead of your actual workspace URL, causing the timeout.
This is a known issue that can occur with MLflow in Databricks, particularly when working with Unity Catalog models. Here's a more targeted approach to fix it:

1. First, ensure your environment variables are correctly set before any MLflow operations:
import os
import mlflow

# Explicitly define the workspace URL (don't rely on environment variables that might not propagate)
workspace_url = "https://your-actual-workspace-url.cloud.databricks.com" # Replace with your actual URL

# Set both tracking and registry URIs explicitly
os.environ["DATABRICKS_HOST"] = workspace_url
mlflow.set_tracking_uri(workspace_url)
mlflow.set_registry_uri("databricks-uc")

2. When loading the model, you might need to be more explicit:
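# Use the full model URI format: models:/<name>/<version> (single slash after the colon).
# For Unity Catalog, model_name is the three-level name: <catalog>.<schema>.<model>
model_uri = f"models:/{model_name}/{model_version}"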
model_udf = mlflow.pyfunc.spark_udf(spark, model_uri, env_manager="conda")

3. For Spark environments, also ensure the configuration is properly set:
spark.conf.set("spark.databricks.mlflow.trackingUri", workspace_url)
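
To check whether the host and token from the steps above (and from the spark.executorEnv settings) actually reach the process that calls MLflow, a small diagnostic can help. This is only a minimal sketch: describe_databricks_env is a hypothetical helper name, not an MLflow or Databricks API, and it reports whether a token is set without printing it.

```python
import os
from typing import Mapping, Optional


def describe_databricks_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Report the Databricks host and whether a token is present.

    Hypothetical helper for debugging; the token value itself is never returned.
    """
    # Fall back to the current process environment when no mapping is passed.
    env = os.environ if env is None else env
    return {
        "host": env.get("DATABRICKS_HOST"),
        "token_set": bool(env.get("DATABRICKS_TOKEN")),
    }
```

On the driver, call describe_databricks_env() directly. To inspect executors, one option (assuming spark.sparkContext is accessible on your cluster) is spark.sparkContext.parallelize(range(2), 2).mapPartitions(lambda _: [describe_databricks_env()]).collect(). If the host reported there differs from the driver's, the executor environment is the likely place to look.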

This issue is often related to how the MLflow client resolves endpoints when working with Unity Catalog: it sometimes falls back to a default domain instead of using the workspace-specific URL you've provided.
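
If a fallback like that is suspected, it may be worth validating the host value before handing it to MLflow. The following is a minimal sketch, assuming the problem comes from a malformed DATABRICKS_HOST (for example, literal quotes around the value or a missing scheme), which nothing in this thread confirms. normalize_host is a hypothetical helper, not part of MLflow.

```python
from urllib.parse import urlparse


def normalize_host(raw: str) -> str:
    """Return the host as 'https://<netloc>' or raise ValueError.

    Hypothetical helper: strips whitespace and stray quotes, adds a missing
    scheme, and drops any path or trailing slash.
    """
    value = raw.strip().strip('"').strip("'")
    if not value:
        raise ValueError("DATABRICKS_HOST is empty")
    # Accept bare host names by assuming HTTPS.
    if "://" not in value:
        value = "https://" + value
    parsed = urlparse(value)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"Unexpected host value: {raw!r}")
    return f"https://{parsed.netloc}"
```

For example, os.environ["DATABRICKS_HOST"] = normalize_host(os.environ.get("DATABRICKS_HOST", "")) run before any MLflow call would fail fast on an empty or non-HTTPS value instead of timing out later.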

LR

Dharma25
New Contributor III

Thanks for the answer. I will try this solution 🙂