
FeatureEngineeringClient and Unity Catalog

Kjetil
Contributor

When testing this code:

from databricks.feature_engineering import FeatureEngineeringClient

fe = FeatureEngineeringClient()

(
    fe.score_batch(
        df=dataset.drop("Target").limit(10),
        model_uri=f"models:/{model_name}/{mv.version}",
    )
    .select("prediction")
    .limit(10)
    .display()
)

I get the error:

"MlflowException: The following failures occurred while downloading one or more artifacts" ... "Connection to storageaccountname.blob.core.windows.net timed out."

This happens ONLY when (i) the model is registered in Unity Catalog (as opposed to the workspace registry) and (ii) the FeatureEngineeringClient is used.

I have access to the data stored in Unity Catalog (I can read and write from the cluster, list files, etc.), and everything works fine when I use the MLflow library instead of the FeatureEngineeringClient, so it should work.

If I instead run with model_uri = f"runs:/{run_id}/model", I get another error:

“ValueError: default auth: cannot configure default credentials.”

To summarise:

  • Using the FeatureEngineeringClient to register and use models in Unity Catalog does NOT work.
  • Using the MLflow client to register, load, and use models works perfectly with Unity Catalog.
  • Using the FeatureEngineeringClient to register and use models in the workspace registry also works.

Runtime: DBR 14.3 LTS ML (Spark 3.5.0)

1 REPLY

mark_ott
Databricks Employee

Your issues are tied to authentication and network/configuration differences between Unity Catalog and Workspace models in Databricks, specifically when using the FeatureEngineeringClient.

Key Issues

  • FeatureEngineeringClient + Unity Catalog: You get "MlflowException: Connection to storageaccountname.blob.core.windows.net timed out." This is a known problem, often related to private endpoint configuration and/or missing or misconfigured authentication for artifact access in Azure Storage.

  • FeatureEngineeringClient + workspace models: Works fine.

  • MLflow Client + Unity Catalog: Works fine, pointing to authentication/config differences between the two clients.

  • model_uri = "runs:/{run_id}/model": Causes "ValueError: default auth: cannot configure default credentials." This indicates the client can't resolve Azure credentials or configuration to access the storage.

Likely Causes & Solutions

Azure Storage Permissions (Unity Catalog)

  • Unity Catalog models/artifacts in Azure live in Azure Storage and require both DFS and Blob private endpoints for access, and the FeatureEngineeringClient reaches that storage directly. If only the DFS or only the Blob endpoint is configured, you will get timeouts or authorization failures.

  • Check that your storage account has private endpoints for both Blob and DFS, and that both hostnames resolve from your workspace (e.g., nslookup storage_account_name.blob.core.windows.net in a Databricks notebook; a quick check is sketched below).
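
For reference, here is a minimal Python sketch of that resolution check; it assumes the placeholder account name from the error message, so substitute your real storage account:

import socket

# Placeholder account name taken from the error message; substitute your own.
storage_account = "storageaccountname"

for suffix in ("blob", "dfs"):
    host = f"{storage_account}.{suffix}.core.windows.net"
    try:
        # A successful lookup shows the endpoint resolves from the cluster's network.
        print(host, "->", socket.gethostbyname(host))
    except socket.gaierror as exc:
        print(host, "did not resolve:", exc)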

Authentication & Credentials

  • "default auth: cannot configure default credentials" means your current context lacks necessary credentials for ML artifacts. Common fixes:

    • Make sure your .databricks.cfg file is set up with a [DEFAULT] profile that includes a Databricks PAT token, or with service principal authentication (see the example profile after this list).

    • For cluster mode, verify your managed identity or service principal has permission to access storage and the Unity Catalog.

    • The MLflow client uses different auth flows; the FeatureEngineeringClient may expect SDK- or environment-based unified authentication.
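
As a point of reference, a minimal .databricks.cfg profile might look like the following; the workspace URL is a made-up placeholder, and the token should be your own PAT or service-principal credential:

[DEFAULT]
host  = https://adb-1234567890123456.7.azuredatabricks.net
token = <your-personal-access-token>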

Client Differences

  • MLflow may fall back to workspace-centric authentication flows that happen to work for Unity Catalog, whereas the FeatureEngineeringClient is stricter or calls Azure APIs directly.

  • Check the Databricks docs for unified authentication for clients and ensure your setup matches what's required for Unity Catalog, especially for non-interactive runs (a minimal environment-variable sketch follows).
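
As a rough sketch of environment-based unified authentication, you can set DATABRICKS_HOST and DATABRICKS_TOKEN before creating the client. The workspace URL, secret scope, and key names below are hypothetical placeholders:

import os
from databricks.feature_engineering import FeatureEngineeringClient

# Hypothetical placeholders: use your own workspace URL and a secret scope/key you control.
# dbutils is available in Databricks notebooks without an import.
os.environ["DATABRICKS_HOST"] = "https://adb-1234567890123456.7.azuredatabricks.net"
os.environ["DATABRICKS_TOKEN"] = dbutils.secrets.get(scope="my-scope", key="pat-token")

# The client should now pick up the credentials via unified authentication.
fe = FeatureEngineeringClient()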

Troubleshooting Checklist

  • Verify network connectivity and endpoint configuration for both Blob and DFS storage in Azure.

  • Confirm you have valid authentication: either Databricks token, service principal, or managed identity in cluster context.

  • Check that your .databricks.cfg contains a proper [DEFAULT] profile.

  • Make sure your cluster/node has required permissions to Unity Catalog storage.

  • If still blocked, try using the MLflow client for serving or accessing Unity Catalog models (see the sketch below).
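
A minimal workaround sketch, reusing model_name, mv, and dataset from the original snippet. Note that loading with plain MLflow bypasses the automatic feature lookup that score_batch performs, so the input must already contain every model feature:

import mlflow

# Point MLflow at the Unity Catalog model registry.
mlflow.set_registry_uri("databricks-uc")

# Load the registered model and score a small sample.
model = mlflow.pyfunc.load_model(f"models:/{model_name}/{mv.version}")
sample = dataset.drop("Target").limit(10).toPandas()
print(model.predict(sample))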

You may also need to consult Databricks support or monitor product release notes for FeatureEngineeringClient, as there are ongoing improvements in Azure/Unity Catalog integration.

For detailed credential setup, see Databricks' unified authentication documentation.

If you need command examples for checking endpoints or configuring .databricks.cfg, share more details about your environment and those can be provided.
