
Using Databricks Connect with serverless compute and MLflow

DaPo
New Contributor II

Hi all,

I have been using databricks-connect with serverless compute to develop and debug my Databricks-related code, and it has worked great so far. Now I have started integrating MLflow into my workflow and am running into an issue: when I run the following code, Spark raises an exception.

import mlflow
import databricks.connect as db_connect

mlflow.login()  # This prints an INFO log: Login successful!
# mlflow.set_model_uri("databricks")
spark_ctx = db_connect.DatabricksSession.builder.serverless(True).getOrCreate()
train_and_log_ml_model(spark_ctx)

The error message is the following:

pyspark.errors.exceptions.connect.AnalysisException: [CONFIG_NOT_AVAILABLE] Configuration spark.mlflow.modelRegistryUri is not available. SQLSTATE: 42K0I

What am I missing? Is there a way to make it work?

Greetings, Daniel

P.S.: My environment is quite bare-bones: a new Python venv in which I pip-installed `databricks-connect==15.1` and `mlflow`. I have configured the databricks-cli to use SSO, with a DEFAULT profile in the file `~/.databrickscfg`.

Accepted Solution

Walter_C
Databricks Employee

The error you are encountering, `pyspark.errors.exceptions.connect.AnalysisException: [CONFIG_NOT_AVAILABLE] Configuration spark.mlflow.modelRegistryUri is not available. SQLSTATE: 42K0I`, is a known issue when using MLflow with serverless compute in Databricks. It arises because the configuration `spark.mlflow.modelRegistryUri` is not set by default in serverless environments, so reading it from the Spark session fails.

To resolve this issue, you can use a workaround that involves setting the registry URI manually. Here is a modified version of your code that includes this workaround:

import mlflow
import mlflow.tracking._model_registry.utils
import databricks.connect as db_connect

# Workaround: override the private MLflow helper that reads
# spark.mlflow.modelRegistryUri from the Spark session, so that the
# registry URI is returned directly instead.
mlflow.tracking._model_registry.utils._get_registry_uri_from_spark_session = lambda: "databricks-uc"

mlflow.login()  # This prints an INFO log: Login successful!
# mlflow.set_model_uri("databricks")
spark_ctx = db_connect.DatabricksSession.builder.serverless(True).getOrCreate()
train_and_log_ml_model(spark_ctx)  # your own training and logging function
