10-17-2023 12:19 PM
Here is my model logging code.
import mlflow

# pipeline, registered_model_name, input_example, signature, and inference_config
# are defined earlier in the notebook.
mlflow.set_registry_uri("databricks-uc")

with mlflow.start_run() as run:
    mlflow.transformers.log_model(
        transformers_model=pipeline,
        artifact_path="gpt2",
        registered_model_name=registered_model_name,
        input_example=input_example,
        signature=signature,
        task="text-generation",
        inference_config=inference_config,
        await_registration_for=60 * 60,
    )
And here is my registration code:
mlflow.set_registry_uri("databricks-uc")
mlflow.set_tracking_uri("databricks")

result = mlflow.register_model(
    model_uri="runs:/" + run.info.run_id + "/model",
    name=registered_name,
    await_registration_for=1000,
)
Here is the full traceback, lightly edited.
MlflowException                           Traceback (most recent call last)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/store/_unity_catalog/registry/rest_store.py:483, in UcModelRegistryStore._local_model_dir(self, source, local_model_path)
    482 try:
--> 483     local_model_dir = mlflow.artifacts.download_artifacts(
    484         artifact_uri=source, tracking_uri=self.tracking_uri
    485     )
    486 except Exception as e:

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/artifacts/__init__.py:60, in download_artifacts(artifact_uri, run_id, artifact_path, dst_path, tracking_uri)
     59 if artifact_uri is not None:
---> 60     return _download_artifact_from_uri(artifact_uri, output_path=dst_path)
     62 artifact_path = artifact_path if artifact_path is not None else ""

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/artifact_utils.py:100, in _download_artifact_from_uri(artifact_uri, output_path)
     99 root_uri, artifact_path = _get_root_uri_and_artifact_path(artifact_uri)
--> 100 return get_artifact_repository(artifact_uri=root_uri).download_artifacts(
    101     artifact_path=artifact_path, dst_path=output_path
    102 )

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/store/artifact/artifact_repo.py:221, in ArtifactRepository.download_artifacts(self, artifact_path, dst_path)
    218 failures = "\n".join(
    219     template.format(path=path, error=error) for path, error in failed_downloads.items()
    220 )
--> 221 raise MlflowException(
    222     message=(
    223         "The following failures occurred while downloading one or more"
    224         f" artifacts from {self.artifact_uri}:\n{_truncate_error(failures)}"
    225     )
    226 )
    228 return os.path.join(dst_path, artifact_path)

MlflowException: The following failures occurred while downloading one or more artifacts from dbfs:/databricks/mlflow-tracking/.../artifacts:
##### File model #####
404 Client Error: Not Found for url: https://$DATABRICKSURL/8188181812650195.jobs/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model...
Response text: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>$PATH/8188181812650195.jobs/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model</Key><RequestId>$REQUESTID</RequestId><HostId>$HOSTID</HostId></Error>

The above exception was the direct cause of the following exception:

MlflowException                           Traceback (most recent call last)
File <command-2982154088058438>, line 76
---> 76 result = mlflow.register_model(
     77     "runs:/"+run.info.run_id+"/model",
     78     name=registered_name,
     79     await_registration_for=1000,
     80 )
     82 from mlflow import MlflowClient
     83 client = MlflowClient(registry_uri="databricks-uc")

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/_model_registry/fluent.py:73, in register_model(model_uri, name, await_registration_for, tags)
     17 def register_model(
     18     model_uri,
     19     name,
    (...)
     22     tags: Optional[Dict[str, Any]] = None,
     23 ) -> ModelVersion:
     24     """
     25     Create a new model version in model registry for the model files specified by ``model_uri``.
     26     Note that this method assumes the model registry backend URI is the same as that of the
    (...)
     71         Version: 1
     72     """
---> 73     return _register_model(
     74         model_uri=model_uri, name=name, await_registration_for=await_registration_for, tags=tags
     75     )

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/_model_registry/fluent.py:108, in _register_model(model_uri, name, await_registration_for, tags, local_model_path)
    105 source = RunsArtifactRepository.get_underlying_uri(model_uri)
    106 (run_id, _) = RunsArtifactRepository.parse_runs_uri(model_uri)
--> 108 create_version_response = client._create_model_version(
    109     name=name,
    110     source=source,
    111     run_id=run_id,
    112     tags=tags,
    113     await_creation_for=await_registration_for,
    114     local_model_path=local_model_path,
    115 )
    116 eprint(
    117     f"Created version '{create_version_response.version}' of model "
    118     f"'{create_version_response.name}'."
    119 )
    120 return create_version_response

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/client.py:2575, in MlflowClient._create_model_version(self, name, source, run_id, tags, run_link, description, await_creation_for, local_model_path)
   2567 # NOTE: we can't easily delete the target temp location due to the async nature
   2568 # of the model version creation - printing to let the user know.
   2569 eprint(
   2570     f"=== Source model files were copied to {new_source}"
   2571     + " in the model registry workspace. You may want to delete the files once the"
   2572     + " model version is in 'READY' status. You can also find this location in the"
   2573     + " `source` field of the created model version. ==="
   2574 )
-> 2575 return self._get_registry_client().create_model_version(
   2576     name=name,
   2577     source=new_source,
   2578     run_id=run_id,
   2579     tags=tags,
   2580     run_link=run_link,
   2581     description=description,
   2582     await_creation_for=await_creation_for,
   2583     local_model_path=local_model_path,
   2584 )

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/tracking/_model_registry/client.py:196, in ModelRegistryClient.create_model_version(self, name, source, run_id, tags, run_link, description, await_creation_for, local_model_path)
    194 arg_names = _get_arg_names(self.store.create_model_version)
    195 if "local_model_path" in arg_names:
--> 196     mv = self.store.create_model_version(
    197         name,
    198         source,
    199         run_id,
    200         tags,
    201         run_link,
    202         description,
    203         local_model_path=local_model_path,
    204     )
    205 else:
    206     # Fall back to calling create_model_version without
    207     # local_model_path since old model registry store implementations may not
    208     # support the local_model_path argument.
    209     mv = self.store.create_model_version(name, source, run_id, tags, run_link, description)

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/store/_unity_catalog/registry/rest_store.py:545, in UcModelRegistryStore.create_model_version(self, name, source, run_id, tags, run_link, description, local_model_path)
    543 extra_headers = {_DATABRICKS_LINEAGE_ID_HEADER: header_base64}
    544 full_name = get_full_name_from_sc(name, self.spark)
--> 545 with self._local_model_dir(source, local_model_path) as local_model_dir:
    546     self._validate_model_signature(local_model_dir)
    547     feature_deps = get_feature_dependencies(local_model_dir)

File /usr/lib/python3.10/contextlib.py:135, in _GeneratorContextManager.__enter__(self)
    133 del self.args, self.kwds, self.func
    134 try:
--> 135     return next(self.gen)
    136 except StopIteration:
    137     raise RuntimeError("generator didn't yield") from None

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a56b0856-4b58-4270-93c1-f4e3d186cf4a/lib/python3.10/site-packages/mlflow/store/_unity_catalog/registry/rest_store.py:487, in UcModelRegistryStore._local_model_dir(self, source, local_model_path)
    483     local_model_dir = mlflow.artifacts.download_artifacts(
    484         artifact_uri=source, tracking_uri=self.tracking_uri
    485     )
    486 except Exception as e:
--> 487     raise MlflowException(
    488         f"Unable to download model artifacts from source artifact location "
    489         f"'{source}' in order to upload them to Unity Catalog. Please ensure "
    490         f"the source artifact location exists and that you can download from "
    491         f"it via mlflow.artifacts.download_artifacts()"
    492     ) from e
    493 # Clean up temporary model directory at end of block. We assume a temporary
    494 # model directory was created if the `source` is not a local path (must be downloaded
    495 # from remote to a temporary directory)
    496 yield local_model_dir

MlflowException: Unable to download model artifacts from source artifact location 'dbfs:/databricks/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model' in order to upload them to Unity Catalog. Please ensure the source artifact location exists and that you can download from it via mlflow.artifacts.download_artifacts()
When I open the DBFS file browser, I don't see any folder called 'databricks', so I decided to look through it with terminal commands. When I run %ls /dbfs/databricks/ I can see two directories: mlflow-registry and mlflow-tracking. When I run %ls /dbfs/databricks/mlflow-tracking/ or %ls /dbfs/databricks/mlflow-registry/, though, I get this error: mount.err*. Granted, I didn't try this with a Unity Catalog-enabled cluster, but I don't think I need one just to browse DBFS. Also, at no point in the process do I mount a directory, but we are using Databricks through AWS, so that connection is probably where things are going wrong.

I then tried using the full path straight from the error message:

%ls /dbfs/databricks/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model

and got:

ls: cannot access '/dbfs/databricks/mlflow-tracking/2982154088058434/a1cecbee2f8441c09f3fbe5d7a7587ff/artifacts/model': No such file or directory

which suggests that the file path really doesn't exist after all. From here I'm at a loss about what to do. I followed the Databricks example code located here and it worked, but for my model things get wonky. I am all out of ideas on where to go from here, so I'd really appreciate any and all tips.
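In case it helps, here is roughly the check I would run with the MLflow APIs instead of ls; this is only a sketch, and the run ID below is a placeholder for run.info.run_id from my logging cell:

import mlflow
from mlflow import MlflowClient

run_id = "<run_id>"  # placeholder for run.info.run_id from the logging cell

client = MlflowClient(tracking_uri="databricks")

# List what the run actually logged under its artifact root; the entries here
# should correspond to the artifact_path that was passed to log_model.
for info in client.list_artifacts(run_id):
    print(info.path, info.is_dir)

# Attempt the same download the registry performs for the failing source URI.
local_dir = mlflow.artifacts.download_artifacts(
    artifact_uri=f"runs:/{run_id}/model", tracking_uri="databricks"
)
print(local_dir)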
12-30-2024 12:02 PM
I've experienced the same error. The issue is that the model URI is not correct.
The model is logged with:
mlflow.transformers.log_model(..., artifact_path="gpt2", ...)
The artifact_path is the last part of the model URI. If you don't specify it, it defaults to "model". Since you're specifying it explicitly here, the correct model URI is:
runs:/<run_id>/gpt2
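As a rough sketch, the registration call would then look something like this, assuming run and registered_name are still the variables from the original post:

import mlflow

mlflow.set_tracking_uri("databricks")
mlflow.set_registry_uri("databricks-uc")

# Point register_model at the artifact_path used in log_model ("gpt2"), not "model".
result = mlflow.register_model(
    model_uri="runs:/" + run.info.run_id + "/gpt2",
    name=registered_name,
    await_registration_for=1000,
)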