Hi All,
Introduction: I am trying to register my model on Databricks so that I can serve it as an endpoint. The packages I need are "torch", "mlflow", "torchvision", "numpy" and "git+https://github.com/facebookresearch/detectron2.git". For this, I created a notebook on Databricks; the relevant part of the code is below:
import joblib
import mlflow
import numpy as np

class DetectronModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Load the pickled detectron2 predictor from the logged artifacts.
        self.predictor = joblib.load(context.artifacts["predictor"])

    def predict(self, context, model_input):
        image = np.array(model_input["image"])
        result = self.predictor(image)
        instances = result['instances']
        return instances
predictor_path = f"/dbfs/mnt/{container_name}/segmentationWeightsPath/predictor.pkl"
conda_env = {
    "channels": ["defaults"],
    "dependencies": [
        "python=3.8",
        "pip",
        {
            "pip": [
                "torch",
                "mlflow",
                "torchvision",
                "numpy",
                "git+https://github.com/facebookresearch/detectron2.git"
            ]
        }
    ]
}
model_info = mlflow.pyfunc.log_model(
    artifact_path="detectron_model_artifact",
    python_model=DetectronModel(),
    artifacts={"predictor": predictor_path},
    conda_env=conda_env
)
model_name = "detectron_model"
model_version = mlflow.register_model(
    model_uri=model_info.model_uri,
    name=model_name
)
Problem: Once the model is registered and I try to serve it, the build fails with `ModuleNotFoundError: No module named 'torch'` while detectron2 is being installed. torch is clearly listed in the conda_env, so I am confused about why I am getting this error.
I have attached the logs for reference.
#20 0.390 channels:
#20 0.390 - defaults
#20 0.390 dependencies:
#20 0.390 - python=3.8
#20 0.390 - pip
#20 0.390 - pip:
#20 0.390 - torch
#20 0.390 - mlflow
#20 0.390 - torchvision
#20 0.390 - numpy
#20 0.390 - git+https://github.com/facebookresearch/detectron2.git
#20 0.647 Collecting package metadata (repodata.json): ...working... done
#20 5.820 Solving environment: ...working... done
#20 6.192
#20 6.192
#20 6.192 ==> WARNING: A newer version of conda exists. <==
#20 6.192 current version: 4.10.3
#20 6.192 latest version: 24.5.0
#20 6.192
#20 6.192 Please update conda by running
#20 6.192
#20 6.192 $ conda update -n base -c defaults conda
#20 6.192
#20 6.192
#20 6.202
#20 6.202 Downloading and Extracting Packages
#20 6.202
ncurses-6.4 | 914 KB | | 0%
ncurses-6.4 | 914 KB | ########## | 100%
ncurses-6.4 | 914 KB | ########## | 100%
#20 6.395
zlib-1.2.13 | 111 KB | | 0%
zlib-1.2.13 | 111 KB | ########## | 100%
#20 6.423
libffi-3.4.4 | 141 KB | | 0%
libffi-3.4.4 | 141 KB | ########## | 100%
#20 6.466
ca-certificates-2024 | 127 KB | | 0%
ca-certificates-2024 | 127 KB | ########## | 100%
#20 6.490
readline-8.2 | 357 KB | | 0%
readline-8.2 | 357 KB | ########## | 100%
#20 6.521
wheel-0.43.0 | 109 KB | | 0%
wheel-0.43.0 | 109 KB | ########## | 100%
#20 6.548
_openmp_mutex-5.1 | 21 KB | | 0%
_openmp_mutex-5.1 | 21 KB | ########## | 100%
#20 6.574
libgomp-11.2.0 | 474 KB | | 0%
libgomp-11.2.0 | 474 KB | ########## | 100%
#20 6.608
ld_impl_linux-64-2.3 | 654 KB | | 0%
ld_impl_linux-64-2.3 | 654 KB | ########## | 100%
#20 6.635
setuptools-69.5.1 | 1002 KB | | 0%
setuptools-69.5.1 | 1002 KB | ########## | 100%
#20 6.699
python-3.8.19 | 23.8 MB | | 0%
python-3.8.19 | 23.8 MB | ########## | 100%
python-3.8.19 | 23.8 MB | ########## | 100%
#20 7.085
tk-8.6.14 | 3.4 MB | | 0%
tk-8.6.14 | 3.4 MB | ########## | 100%
#20 7.172
openssl-3.0.13 | 5.2 MB | | 0%
openssl-3.0.13 | 5.2 MB | ########## | 100%
#20 7.267
sqlite-3.45.3 | 1.2 MB | | 0%
sqlite-3.45.3 | 1.2 MB | ########## | 100%
#20 7.304
libgcc-ng-11.2.0 | 5.3 MB | | 0%
libgcc-ng-11.2.0 | 5.3 MB | ########## | 100%
libgcc-ng-11.2.0 | 5.3 MB | ########## | 100%
#20 7.424
xz-5.4.6 | 643 KB | | 0%
xz-5.4.6 | 643 KB | ########## | 100%
#20 7.463
libstdcxx-ng-11.2.0 | 4.7 MB | | 0%
libstdcxx-ng-11.2.0 | 4.7 MB | ########## | 100%
#20 7.553
pip-24.0 | 2.6 MB | | 0%
pip-24.0 | 2.6 MB | ########## | 100%
pip-24.0 | 2.6 MB | ########## | 100%
#20 7.675 Preparing transaction: ...working... done
#20 7.841 Verifying transaction: ...working... done
#20 8.559 Executing transaction: ...working... done
#20 8.934 Installing pip dependencies: ...working... Pip subprocess error:
#20 10.81 Running command git clone --filter=blob:none --quiet https://github.com/facebookresearch/detectron2.git /tmp/pip-req-build-uc1k_xq1
#20 10.81 error: subprocess-exited-with-error
#20 10.81
#20 10.81 × python setup.py egg_info did not run successfully.
#20 10.81 │ exit code: 1
#20 10.81 ╰─> [6 lines of output]
#20 10.81 Traceback (most recent call last):
#20 10.81 File "<string>", line 2, in <module>
#20 10.81 File "<pip-setuptools-caller>", line 34, in <module>
#20 10.81 File "/tmp/pip-req-build-uc1k_xq1/setup.py", line 10, in <module>
#20 10.81 import torch
#20 10.81 ModuleNotFoundError: No module named 'torch'
#20 10.81 [end of output]
#20 10.81
#20 10.81 note: This error originates from a subprocess, and is likely not a problem with pip.
#20 10.81 error: metadata-generation-failed
#20 10.81
#20 10.81 × Encountered error while generating package metadata.
#20 10.81 ╰─> See above for output.
#20 10.81
#20 10.81 note: This is an issue with the package mentioned above, not pip.
#20 10.81 hint: See above for details.
#20 10.81
#20 10.81 Ran pip subprocess with arguments:
#20 10.81 ['/opt/conda/envs/mlflow-env/bin/python', '-m', 'pip', 'install', '-U', '-r', '/model/condaenv.detud3st.requirements.txt']
#20 10.81 Pip subprocess output:
#20 10.81 Collecting git+https://github.com/facebookresearch/detectron2.git (from -r /model/condaenv.detud3st.requirements.txt (line 5))
#20 10.81 Cloning https://github.com/facebookresearch/detectron2.git to /tmp/pip-req-build-uc1k_xq1
#20 10.81 Resolved https://github.com/facebookresearch/detectron2.git to commit 79f914785a87b80565381f4489b129e633c4efb5
#20 10.81 Preparing metadata (setup.py): started
#20 10.81 Preparing metadata (setup.py): finished with status 'error'
#20 10.81
#20 10.81 failed
#20 10.81
#20 10.81 CondaEnvException: Pip failed
#20 10.81
#20 ERROR: process "/bin/sh -c echo $BUILD_LOG_START_DELIMITER && cat model/conda.yaml && conda env create -f model/conda.yaml -n mlflow-env && echo $BUILD_LOG_CONDA_END_DELIMITER && echo $BUILD_LOG_END_DELIMITER && conda clean -afy" did not complete successfully: exit code: 1
------
It looks like the detectron2 installation is being triggered before torch has been installed.
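To make this suspicion concrete: as far as I understand, pip prepares the metadata of every entry in the requirements file before it installs any of them, so when detectron2's setup.py runs `import torch` (line 10 in the traceback), the torch entry from the same list is not installed yet. Below is a minimal sketch that flags the entries in my conda_env that pip has to build from source. The helper name `source_build_requirements` is hypothetical and not part of MLflow or pip.

```python
# Hypothetical helper (not part of MLflow or pip): list the pip requirements
# in a conda_env dict that are VCS URLs and therefore built from source.
def source_build_requirements(conda_env):
    """Return pip entries given as VCS URLs (e.g. git+https://...)."""
    flagged = []
    for dep in conda_env.get("dependencies", []):
        # In a conda env spec, the pip section is a dict with a "pip" key.
        if isinstance(dep, dict):
            for req in dep.get("pip", []):
                if req.startswith(("git+", "hg+", "svn+", "bzr+")):
                    flagged.append(req)
    return flagged


# Same environment as in the post above.
conda_env = {
    "channels": ["defaults"],
    "dependencies": [
        "python=3.8",
        "pip",
        {
            "pip": [
                "torch",
                "mlflow",
                "torchvision",
                "numpy",
                "git+https://github.com/facebookresearch/detectron2.git",
            ]
        },
    ],
}

for req in source_build_requirements(conda_env):
    # These entries run their build step before the rest of the list
    # has been installed, so they cannot import torch at that point.
    print(f"built from source during env creation: {req}")
```

In my case this flags only the detectron2 entry, which matches the single package that fails in the log.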
I would appreciate any help with this problem and am happy to share more information.