11-04-2024 09:38 AM
Hi all,
I have been using databricks-connect with serverless compute to develop and debug my Databricks-related code, and it has worked great so far. Now I have started integrating MLflow into my workflow, and I am encountering an issue. When I run the following code, I get an exception out of the Spark runtime:
```python
import mlflow
import databricks.connect as db_connect

mlflow.login()  # This prints an INFO log: Login successful!
# mlflow.set_model_uri("databricks")
spark_ctx = db_connect.DatabricksSession.builder.serverless(True).getOrCreate()
train_and_log_ml_model(spark_ctx)
```
The error message is the following:
```
pyspark.errors.exceptions.connect.AnalysisException: [CONFIG_NOT_AVAILABLE] Configuration spark.mlflow.modelRegistryUri is not available. SQLSTATE: 42K0I
```
What am I missing? Is there a way to make it work?
Greetings, Daniel
P.S.: My environment is quite bare-bones: a new Python venv where I pip-installed `databricks-connect==15.1` and `mlflow`. I have configured the Databricks CLI to use SSO, with a DEFAULT profile in `~/.databrickscfg`.
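For reference, a minimal sanity check of the Databricks Connect side alone (no MLflow involved) looks like this in my environment; the session builder call is the same one used above, and `spark.range` is just a trivial query:

```python
from databricks.connect import DatabricksSession

# Open a serverless session via the DEFAULT profile from ~/.databrickscfg
spark = DatabricksSession.builder.serverless(True).getOrCreate()

# Trivial query to confirm the connection works; this part runs fine,
# only the MLflow integration fails.
print(spark.range(5).collect())
```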
Accepted Solutions
11-04-2024 12:01 PM
The error you are encountering, `pyspark.errors.exceptions.connect.AnalysisException: [CONFIG_NOT_AVAILABLE] Configuration spark.mlflow.modelRegistryUri is not available. SQLSTATE: 42K0I`, is a known issue when using MLflow with serverless compute in Databricks. It arises because the configuration `spark.mlflow.modelRegistryUri` is not set by default in serverless environments.
To resolve this, you can use a workaround that sets the registry URI manually. Here is a modified version of your code that includes this workaround:
```python
import mlflow
import databricks.connect as db_connect
import mlflow.tracking._model_registry.utils

# Workaround: override the private helper so MLflow does not try to read
# spark.mlflow.modelRegistryUri from the (serverless) Spark session.
mlflow.tracking._model_registry.utils._get_registry_uri_from_spark_session = lambda: "databricks-uc"

mlflow.login()  # This prints an INFO log: Login successful!
# mlflow.set_model_uri("databricks")
spark_ctx = db_connect.DatabricksSession.builder.serverless(True).getOrCreate()
train_and_log_ml_model(spark_ctx)
```
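If you prefer not to patch a private MLflow helper, a possible alternative is to set the tracking and registry URIs through MLflow's public API before any registry call. This is only a sketch, and I have not verified that it avoids the Spark-config lookup on serverless; if it does not, the patch above remains the working option:

```python
import mlflow
import databricks.connect as db_connect

# Untested alternative: point MLflow at the Databricks tracking server and the
# Unity Catalog model registry explicitly, instead of patching the private helper.
mlflow.set_tracking_uri("databricks")
mlflow.set_registry_uri("databricks-uc")

mlflow.login()
spark_ctx = db_connect.DatabricksSession.builder.serverless(True).getOrCreate()
train_and_log_ml_model(spark_ctx)
```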