11-04-2024 09:38 AM
Hi all,
I have been using databricks-connect with serverless compute to develop and debug my Databricks-related code, and it has worked well so far. I have now started integrating MLflow into my workflow and am running into an issue: when I run the following code, I get an exception from the Spark runtime.
import mlflow
import databricks.connect as db_connect
mlflow.login()  # This prints an INFO log: Login successful!
# mlflow.set_model_uri("databricks")
spark_ctx = db_connect.DatabricksSession.builder.serverless(True).getOrCreate()
train_and_log_ml_model(spark_ctx)
The error message is the following:
pyspark.errors.exceptions.connect.AnalysisException: [CONFIG_NOT_AVAILABLE] Configuration spark.mlflow.modelRegistryUri is not available. SQLSTATE: 42K0I
What am I missing? Is there a way to make it work?
Greetings, Daniel
P.S.: My environment is quite bare-bones: a new Python venv in which I pip-installed `databricks-connect==15.1` and `mlflow`. I have configured the databricks-cli to use SSO, with a DEFAULT profile in the file `~/.databrickscfg`.
11-04-2024 12:01 PM
The error you are encountering, pyspark.errors.exceptions.connect.AnalysisException: [CONFIG_NOT_AVAILABLE] Configuration spark.mlflow.modelRegistryUri is not available. SQLSTATE: 42K0I, is a known issue when using MLflow with serverless compute in Databricks. It arises because MLflow tries to read the configuration spark.mlflow.modelRegistryUri from the Spark session, and that configuration is not set by default in serverless environments.
To resolve this issue, you can use a workaround that involves setting the registry URI manually. Here is a modified version of your code that includes this workaround:
import mlflow
import databricks.connect as db_connect
import mlflow.tracking._model_registry.utils
# Workaround to set the registry URI manually
mlflow.tracking._model_registry.utils._get_registry_uri_from_spark_session = lambda: "databricks-uc"
mlflow.login() # This prints an INFO-log: Login successful!
# mlflow.set_model_uri("databricks")
spark_ctx = db_connect.DatabricksSession.builder.serverless(True).getOrCreate()
train_and_log_ml_model(spark_ctx)
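Because the workaround overrides a private MLflow function, it may silently stop working if a later MLflow release renames or removes that function. The following is a minimal sketch of a more defensive variant, not an official MLflow API: the helper names `override_registry_uri` and `apply_to_mlflow` are hypothetical, and the only MLflow detail assumed is the private function name already used in the workaround above. The patching logic is demonstrated on a stand-in object so it runs without MLflow installed.

```python
import importlib
from types import SimpleNamespace
from typing import Callable

# Name of the private MLflow helper that the workaround above overrides.
PRIVATE_HELPER = "_get_registry_uri_from_spark_session"


def override_registry_uri(utils_module, uri: str = "databricks-uc") -> Callable[[], None]:
    """Replace the private helper on `utils_module`; return a function that undoes it.

    Raises AttributeError if the helper is missing, for example because the
    installed MLflow version renamed or removed it.
    """
    if not hasattr(utils_module, PRIVATE_HELPER):
        raise AttributeError(
            f"{PRIVATE_HELPER} not found; the workaround may not apply to this MLflow version"
        )
    original = getattr(utils_module, PRIVATE_HELPER)
    # Same effect as the one-line lambda assignment in the workaround.
    setattr(utils_module, PRIVATE_HELPER, lambda: uri)

    def restore() -> None:
        setattr(utils_module, PRIVATE_HELPER, original)

    return restore


def apply_to_mlflow(uri: str = "databricks-uc") -> Callable[[], None]:
    """Hypothetical convenience wrapper: patch the real MLflow module (requires mlflow)."""
    utils_module = importlib.import_module("mlflow.tracking._model_registry.utils")
    return override_registry_uri(utils_module, uri)


# Stand-in demo that needs no MLflow: a fake module whose helper returns None.
fake_utils = SimpleNamespace(**{PRIVATE_HELPER: lambda: None})
restore = override_registry_uri(fake_utils)
patched_value = getattr(fake_utils, PRIVATE_HELPER)()
restore()
restored_value = getattr(fake_utils, PRIVATE_HELPER)()
```

In a real script you would call `apply_to_mlflow()` before `mlflow.login()`, in place of the lambda assignment. If it raises AttributeError, the private function is no longer where the workaround expects it.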
05-05-2025 06:37 AM
Is this a problem with the serverless compute resource? I tried the approach you recommended but still run into issues, even though I connect to the tracking server successfully.