MlflowException: Unable to download model artifacts in Databricks while registering model with MLflow

AChang
New Contributor III
I am attempting to log, register, and deploy a fine-tuned GPT-2 model in Databricks. My logging code runs, but when I run my registration code I get an MlflowException.

Here is my model logging code.

mlflow.set_registry_uri("databricks-uc")

with mlflow.start_run() as run:
    mlflow.transformers.log_model(
        transformers_model=pipeline,
        artifact_path="gpt2",
        registered_model_name=registered_model_name,
        input_example=input_example, 
        signature=signature,
        task="text-generation",
        inference_config=inference_config,
        await_registration_for=60 * 60,
    )

And here is my registration code:

mlflow.set_registry_uri("databricks-uc")
mlflow.set_tracking_uri("databricks")

result = mlflow.register_model(
    model_uri="runs:/"+run.info.run_id+"/model",
    name=registered_name,
    await_registration_for=1000,
)

Here is the full traceback, lightly edited.

MlflowException                           Traceback (most recent call last)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/store/_unity_catalog/registry/rest_store.py:483, in UcModelRegistryStore._local_model_dir(self, source, local_model_path)
    482 try:
--> 483     local_model_dir = mlflow.artifacts.download_artifacts(
    484         artifact_uri=source, tracking_uri=self.tracking_uri
    485     )
    486 except Exception as e:

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/artifacts/__init__.py:60, in download_artifacts(artifact_uri, run_id, artifact_path, dst_path, tracking_uri)
     59 if artifact_uri is not None:
---> 60     return _download_artifact_from_uri(artifact_uri, output_path=dst_path)
     62 artifact_path = artifact_path if artifact_path is not None else ""

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/artifact_utils.py:100, in _download_artifact_from_uri(artifact_uri, output_path)
     99 root_uri, artifact_path = _get_root_uri_and_artifact_path(artifact_uri)
--> 100 return get_artifact_repository(artifact_uri=root_uri).download_artifacts(
    101     artifact_path=artifact_path, dst_path=output_path
    102 )

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/store/artifact/artifact_repo.py:221, in ArtifactRepository.download_artifacts(self, artifact_path, dst_path)
    218     failures = "\n".join(
    219         template.format(path=path, error=error) for path, error in failed_downloads.items()
    220     )
--> 221     raise MlflowException(
    222         message=(
    223             "The following failures occurred while downloading one or more"
    224             f" artifacts from {self.artifact_uri}:\n{_truncate_error(failures)}"
    225         )
    226     )
    228 return os.path.join(dst_path, artifact_path)

MlflowException: The following failures occurred while downloading one or more artifacts from dbfs:/databricks/mlflow-tracking/.../artifacts:
##### File model #####
404 Client Error: Not Found for url: https://$DATABRICKSURL/8188181812650195.jobs/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model... Response text: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>$PATH/8188181812650195.jobs/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model</Key><RequestId>$REQUESTID</RequestId><HostId>$HOSTID</HostId></Error>

The above exception was the direct cause of the following exception:

MlflowException                           Traceback (most recent call last)
File <command-2982154088058438>, line 76
---> 76 result = mlflow.register_model(
     77     "runs:/"+run.info.run_id+"/model",
     78     name=registered_name,
     79     await_registration_for=1000,
     80 )
     82 from mlflow import MlflowClient
     83 client = MlflowClient(registry_uri="databricks-uc")

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/_model_registry/fluent.py:73, in register_model(model_uri, name, await_registration_for, tags)
     17 def register_model(
     18     model_uri,
     19     name,
   (...)
     22     tags: Optional[Dict[str, Any]] = None,
     23 ) -> ModelVersion:
     24     """
     25     Create a new model version in model registry for the model files specified by ``model_uri``.
     26     Note that this method assumes the model registry backend URI is the same as that of the
   (...)
     71         Version: 1
     72     """
---> 73     return _register_model(
     74         model_uri=model_uri, name=name, await_registration_for=await_registration_for, tags=tags
     75     )

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/_model_registry/fluent.py:108, in _register_model(model_uri, name, await_registration_for, tags, local_model_path)
    105     source = RunsArtifactRepository.get_underlying_uri(model_uri)
    106     (run_id, _) = RunsArtifactRepository.parse_runs_uri(model_uri)
--> 108 create_version_response = client._create_model_version(
    109     name=name,
    110     source=source,
    111     run_id=run_id,
    112     tags=tags,
    113     await_creation_for=await_registration_for,
    114     local_model_path=local_model_path,
    115 )
    116 eprint(
    117     f"Created version '{create_version_response.version}' of model "
    118     f"'{create_version_response.name}'."
    119 )
    120 return create_version_response

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/client.py:2575, in MlflowClient._create_model_version(self, name, source, run_id, tags, run_link, description, await_creation_for, local_model_path)
   2567     # NOTE: we can't easily delete the target temp location due to the async nature
   2568     # of the model version creation - printing to let the user know.
   2569     eprint(
   2570         f"=== Source model files were copied to {new_source}"
   2571         + " in the model registry workspace. You may want to delete the files once the"
   2572         + " model version is in 'READY' status. You can also find this location in the"
   2573         + " `source` field of the created model version. ==="
   2574     )
-> 2575 return self._get_registry_client().create_model_version(
   2576     name=name,
   2577     source=new_source,
   2578     run_id=run_id,
   2579     tags=tags,
   2580     run_link=run_link,
   2581     description=description,
   2582     await_creation_for=await_creation_for,
   2583     local_model_path=local_model_path,
   2584 )

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/_model_registry/client.py:196, in ModelRegistryClient.create_model_version(self, name, source, run_id, tags, run_link, description, await_creation_for, local_model_path)
    194 arg_names = _get_arg_names(self.store.create_model_version)
    195 if "local_model_path" in arg_names:
--> 196     mv = self.store.create_model_version(
    197         name,
    198         source,
    199         run_id,
    200         tags,
    201         run_link,
    202         description,
    203         local_model_path=local_model_path,
    204     )
    205 else:
    206     # Fall back to calling create_model_version without
    207     # local_model_path since old model registry store implementations may not
    208     # support the local_model_path argument.
    209     mv = self.store.create_model_version(name, source, run_id, tags, run_link, description)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/store/_unity_catalog/registry/rest_store.py:545, in UcModelRegistryStore.create_model_version(self, name, source, run_id, tags, run_link, description, local_model_path)
    543     extra_headers = {_DATABRICKS_LINEAGE_ID_HEADER: header_base64}
    544 full_name = get_full_name_from_sc(name, self.spark)
--> 545 with self._local_model_dir(source, local_model_path) as local_model_dir:
    546     self._validate_model_signature(local_model_dir)
    547     feature_deps = get_feature_dependencies(local_model_dir)

File /usr/lib/python3.10/contextlib.py:135, in _GeneratorContextManager.__enter__(self)
    133 del self.args, self.kwds, self.func
    134 try:
--> 135     return next(self.gen)
    136 except StopIteration:
    137     raise RuntimeError("generator didn't yield") from None

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/store/_unity_catalog/registry/rest_store.py:487, in UcModelRegistryStore._local_model_dir(self, source, local_model_path)
    483     local_model_dir = mlflow.artifacts.download_artifacts(
    484         artifact_uri=source, tracking_uri=self.tracking_uri
    485     )
    486 except Exception as e:
--> 487     raise MlflowException(
    488         f"Unable to download model artifacts from source artifact location "
    489         f"'{source}' in order to upload them to Unity Catalog. Please ensure "
    490         f"the source artifact location exists and that you can download from "
    491         f"it via mlflow.artifacts.download_artifacts()"
    492     ) from e
    493 # Clean up temporary model directory at end of block. We assume a temporary
    494 # model directory was created if the `source` is not a local path (must be downloaded
    495 # from remote to a temporary directory)
    496 yield local_model_dir

MlflowException: Unable to download model artifacts from source artifact location 'dbfs:/databricks/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model' in order to upload them to Unity Catalog. Please ensure the source artifact location exists and that you can download from it via mlflow.artifacts.download_artifacts()

When I open the DBFS file browser, I don't see any folder called 'databricks', so I looked through it with terminal commands instead. When I run `%ls /dbfs/databricks/` I can see two directories: mlflow-registry and mlflow-tracking. When I run `%ls /dbfs/databricks/mlflow-tracking/` or `%ls /dbfs/databricks/mlflow-registry/`, though, I get this error: mount.err*. Granted, I didn't try this with a Unity Catalog-enabled cluster, but I don't think I need one to browse DBFS. Also, at no point in the process do I mount a directory, but we are using Databricks through AWS, so that connection is probably where things are going wrong.

I then tried the full path straight from the error message: `%ls /dbfs/databricks/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model` and got the error: ls: cannot access '/dbfs/databricks/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model': No such file or directory, which suggests that the file path does not exist after all!

From here, though, I'm at a loss as to what to do. I followed the Databricks example code located here and it worked, but with my model things get wonky. I am all out of ideas, so I'd really appreciate any and all tips.

1 REPLY

Kaniz_Fatma
Community Manager

Hi @AChang, I suggest creating a new Databricks cluster and running your code to see if the issue is specific to your current cluster configuration.
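
Separately from the cluster configuration, one detail in the posted code may be worth ruling out. The logging call passes artifact_path="gpt2", while the registration call builds its URI as "runs:/" + run_id + "/model", and the 404 in the traceback is for a key ending in /artifacts/model. The following is a minimal sketch, not a confirmed fix, that lists what the run actually contains and builds the URI from the path used at logging time. build_model_uri and list_top_level_artifacts are hypothetical helper names introduced here for illustration; MlflowClient.list_artifacts is the MLflow call they rely on.

```python
def build_model_uri(run_id: str, artifact_path: str) -> str:
    """Return a runs:/ URI for the artifact path that was used when logging."""
    # Strip stray slashes so "gpt2", "/gpt2" and "gpt2/" all give the same URI.
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


def list_top_level_artifacts(run_id: str) -> list[str]:
    """List the top-level artifact paths stored under a run."""
    # Imported lazily so build_model_uri can be used without MLflow installed.
    from mlflow import MlflowClient

    client = MlflowClient()
    return [info.path for info in client.list_artifacts(run_id)]


# Hypothetical notebook usage, assuming `run` and `registered_name` exist
# as in the original post:
#
#   print(list_top_level_artifacts(run.info.run_id))
#   result = mlflow.register_model(
#       model_uri=build_model_uri(run.info.run_id, "gpt2"),
#       name=registered_name,
#   )
```

If the listing shows gpt2 rather than model, the URI mismatch would explain the NoSuchKey response. Note also that log_model already receives registered_model_name, which creates a model version at logging time, so the separate register_model call may not be needed at all.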
