FeatureEngineeringClient and Unity Catalog

Kjetil
New Contributor III

When I run this code:

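from databricks.feature_engineering import FeatureEngineeringClient

fe = FeatureEngineeringClient()

# dataset, model_name and mv are defined earlier in the notebook.
(
    fe.score_batch(
        df=dataset.drop("Target").limit(10),
        model_uri=f"models:/{model_name}/{mv.version}",
    )
    .select("prediction")
    .limit(10)
    .display()
)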

I get the error:

"MlflowException: The following failures occurred while downloading one or more artifacts" ... "Connection to storageaccountname.blob.core.windows.net timed out."

This happens only when both of these hold: (i) the model is registered in Unity Catalog (as opposed to the workspace), and (ii) the FeatureEngineeringClient is used.

I have access to the data stored in Unity Catalog (I can read and write from the cluster, list files, etc.), and everything works fine when I use the MLflow library instead of the FeatureEngineeringClient, so it should work.
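Since the error is a connection timeout to a storage endpoint, one thing that could be checked from the same cluster is whether a plain TCP connection to that endpoint opens at all. The following is only a minimal sketch: the helper name `can_connect`, the port 443 and the 5-second timeout are my own assumptions, and the host in the usage comment is the placeholder from the error message, not a real account.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it only tests reachability,
    not authentication or authorization.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Timeouts, refused connections and DNS failures are all OSError subclasses.
        return False


# Example usage (replace the placeholder with the host from the real error):
# can_connect("storageaccountname.blob.core.windows.net")
```

If this returns False for the host in the error but True for the storage account that works with MLflow, that would point towards a network rule rather than a permissions problem, though it does not prove it.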

If I instead use the model_uri f"runs:/{run_id}/model", I get a different error:

“ValueError: default auth: cannot configure default credentials.”

To summarise:

  • Using the FeatureEngineeringClient to register and use models in Unity Catalog does NOT work.
  • Using the MLflow client to register, load and use models works perfectly with Unity Catalog.
  • Using the FeatureEngineeringClient to register and use models in the workspace also works.
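To separate the artifact download from the scoring step, the combinations above could be exercised with a small harness like the one below. This is a sketch under assumptions: `check_downloads` and `_mlflow_download` are hypothetical names, the default path relies on `mlflow.artifacts.download_artifacts` and on setting the registry URI to "databricks-uc" for Unity Catalog models, and it only needs MLflow installed when the default downloader is actually used.

```python
from typing import Callable, Dict, Iterable, Optional


def _mlflow_download(uri: str) -> str:
    """Download the artifacts behind `uri` with MLflow and return the local path."""
    import mlflow  # imported lazily so the harness itself has no hard dependency

    # Assumption: the models:/ URIs refer to Unity Catalog models.
    mlflow.set_registry_uri("databricks-uc")
    return mlflow.artifacts.download_artifacts(artifact_uri=uri)


def check_downloads(
    uris: Iterable[str],
    download: Optional[Callable[[str], str]] = None,
) -> Dict[str, str]:
    """Try to download each URI and record "ok" or the error that was raised."""
    download = download or _mlflow_download
    results: Dict[str, str] = {}
    for uri in uris:
        try:
            download(uri)
            results[uri] = "ok"
        except Exception as exc:  # keep going so every URI gets a result
            results[uri] = f"{type(exc).__name__}: {exc}"
    return results


# Example usage in the notebook (names taken from the snippets above):
# check_downloads([f"models:/{model_name}/{mv.version}", f"runs:/{run_id}/model"])
```

If both URIs download fine here but score_batch still fails, the difference would seem to lie in how the FeatureEngineeringClient fetches the model rather than in the model registration itself.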

Runtime: DBR 14.3 LTS ML (Spark 3.5.0)

