Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.

MLlib load from UC Volume: IllegalArgumentException: Cannot access the UC Volume path...

stiaangerber
New Contributor III

I'm trying to store MLlib instances in Unity Catalog Volumes. I think volumes are a great way to keep things organized.

I can save to a volume without any issues and I can access the data using spark.read and with plain python open(). However, when I try to load a saved MLlib instance using MLReader.load, I get:

 

IllegalArgumentException: Cannot access the UC Volume path from this location.

 

See the attached demo. I'm a metastore admin and owner of the volume.

Am I missing something? Is this not supported (yet)? Can I make it work somehow?

I've tested this on DBR 13.3 and 14.2, same thing. Any help will be much appreciated.

3 REPLIES

Thanks @Retired_mod. For now, I implemented a workaround. For others with a similar issue, this works:

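import uuid

from pyspark.ml.feature import DCT  # assuming DCT is pyspark.ml.feature.DCT

def load_model(mlclass, uc_vol_path):
    # Copy the saved instance from the UC volume to a temporary DBFS directory,
    # because loading directly from the /Volumes path raises the exception.
    # tmpdir is assigned before the try so the finally block can always see it.
    tmpdir = f'/FileStore/{uuid.uuid4().hex}'
    try:
        dbutils.fs.cp(uc_vol_path, tmpdir, recurse=True)
        return mlclass.load(tmpdir)
    finally:
        # Remove the temporary copy whether or not the load succeeded.
        dbutils.fs.rm(tmpdir, recurse=True)

dct2 = load_model(DCT, '/Volumes/demo/default/vol/dct')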

stiaangerber
New Contributor III

Actually, the above is no good for models with associated data: you can only delete the tmpdir after you're done using the model.
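One way to handle that caveat is to keep the temporary copy alive for as long as the model is in use and delete it only afterwards. The following is a minimal sketch of that idea, not a tested Databricks recipe. The name `loaded_model` and its `copy`, `remove`, and `tmp_root` parameters are hypothetical; the copy and remove steps are passed in as callables so the same helper can wrap `dbutils.fs.cp` / `dbutils.fs.rm` in a notebook.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def loaded_model(load, src_path, copy, remove, tmp_root='/FileStore'):
    """Copy src_path to a temporary directory, load from it, clean up on exit.

    load, copy and remove are caller-supplied callables (hypothetical
    parameters for this sketch), for example mlclass.load and thin wrappers
    around dbutils.fs.cp and dbutils.fs.rm.
    """
    tmpdir = f'{tmp_root}/{uuid.uuid4().hex}'
    try:
        copy(src_path, tmpdir)
        # The temporary copy stays in place while the with-block runs.
        yield load(tmpdir)
    finally:
        # Runs when the with-block exits, including on errors.
        remove(tmpdir)


# Possible notebook usage (assumes dbutils and DCT are available):
# with loaded_model(
#     DCT.load,
#     '/Volumes/demo/default/vol/dct',
#     copy=lambda src, dst: dbutils.fs.cp(src, dst, recurse=True),
#     remove=lambda path: dbutils.fs.rm(path, recurse=True),
# ) as dct2:
#     ...  # use dct2 here; the temporary copy is deleted afterwards
```

The trade-off is that the model must only be used inside the `with` block; anything that still reads from the temporary directory after the block exits will hit the same problem.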

slimexy
New Contributor II

Just to add: if the ML model is saved and then loaded within the same execution, calling load() does not raise the exception above. Copying the model directory from the UC volume to ephemeral storage attached to the driver node is also a workaround (with no need to delete a tmpdir in DBFS after loading the model), but it works in single-node mode only.
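A rough sketch of the driver-local copy described above could look like the following. It relies on the volume path being readable with ordinary Python file APIs, as noted in the original question. The helper name `copy_to_driver` and its `prefix` parameter are made up for illustration, and loading through a `file:` URI is an assumption that would only hold on a single-node cluster, where the driver's local disk is the only one involved.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_driver(src_dir, prefix='mlmodel_'):
    """Copy a saved model directory to local temporary storage on the driver.

    Returns the local directory as a pathlib.Path. The copy lives on
    ephemeral driver storage, so it is not visible to worker nodes.
    """
    # mkdtemp creates a fresh, uniquely named directory on the local disk.
    local_dir = Path(tempfile.mkdtemp(prefix=prefix)) / Path(src_dir).name
    shutil.copytree(src_dir, local_dir)
    return local_dir


# Possible notebook usage (assumes DCT is imported and a single-node cluster):
# local_dir = copy_to_driver('/Volumes/demo/default/vol/dct')
# dct2 = DCT.load(local_dir.as_uri())  # e.g. file:///tmp/mlmodel_.../dct
```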
