Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Error during build process for serving model caused by detectron2

StephenDsouza
New Contributor II

Hi All,

Introduction: I am trying to register my model on Databricks so that I can serve it as an endpoint. The packages that I need are "torch", "mlflow", "torchvision", "numpy", and "git+https://github.com/facebookresearch/detectron2.git". For this, I created a notebook on Databricks; part of the code is below:

 

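import joblib
import mlflow
import numpy as np


class DetectronModel(mlflow.pyfunc.PythonModel):

    def load_context(self, context):
        # Load the pickled Detectron2 predictor from the logged artifacts.
        self.predictor = joblib.load(context.artifacts["predictor"])

    def predict(self, context, model_input):
        # Expects an "image" column holding the pixel data.
        image = np.array(model_input["image"])
        result = self.predictor(image)
        instances = result['instances']
        return instances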

predictor_path = f"/dbfs/mnt/{container_name}/segmentationWeightsPath/predictor.pkl"

conda_env = {
    "channels": ["defaults"],
    "dependencies": [
        "python=3.8",
        "pip",
        {
            "pip": [
                "torch",
                "mlflow",
                "torchvision",
                "numpy",
                "git+https://github.com/facebookresearch/detectron2.git"
            ]
        }
    ]
}

model_info = mlflow.pyfunc.log_model(
    artifact_path="detectron_model_artifact",
    python_model=DetectronModel(),
    artifacts={"predictor": predictor_path},
    conda_env=conda_env
)

model_name = "detectron_model"
model_version = mlflow.register_model(
    model_uri=model_info.model_uri,
    name=model_name
)

Problem: Once the model is registered and I try to serve it, the build process fails with `ModuleNotFoundError: No module named 'torch'` while detectron2 is being installed. torch is clearly listed in the conda_env, so I am confused about why I am getting this error.

I have attached the logs for reference.

#20 0.390 channels:
#20 0.390 - defaults
#20 0.390 dependencies:
#20 0.390 - python=3.8
#20 0.390 - pip
#20 0.390 - pip:
#20 0.390   - torch
#20 0.390   - mlflow
#20 0.390   - torchvision
#20 0.390   - numpy
#20 0.390   - git+https://github.com/facebookresearch/detectron2.git
#20 0.647 Collecting package metadata (repodata.json): ...working... done
#20 5.820 Solving environment: ...working... done
#20 6.192 
#20 6.192 
#20 6.192 ==> WARNING: A newer version of conda exists. <==
#20 6.192   current version: 4.10.3
#20 6.192   latest version: 24.5.0
#20 6.192 
#20 6.192 Please update conda by running
#20 6.192 
#20 6.192     $ conda update -n base -c defaults conda
#20 6.192 
#20 6.192 
#20 6.202 
#20 6.202 Downloading and Extracting Packages
#20 6.202 
ncurses-6.4          | 914 KB    |            |   0% 
ncurses-6.4          | 914 KB    | ########## | 100% 
ncurses-6.4          | 914 KB    | ########## | 100% 
#20 6.395 
zlib-1.2.13          | 111 KB    |            |   0% 
zlib-1.2.13          | 111 KB    | ########## | 100% 
#20 6.423 
libffi-3.4.4         | 141 KB    |            |   0% 
libffi-3.4.4         | 141 KB    | ########## | 100% 
#20 6.466 
ca-certificates-2024 | 127 KB    |            |   0% 
ca-certificates-2024 | 127 KB    | ########## | 100% 
#20 6.490 
readline-8.2         | 357 KB    |            |   0% 
readline-8.2         | 357 KB    | ########## | 100% 
#20 6.521 
wheel-0.43.0         | 109 KB    |            |   0% 
wheel-0.43.0         | 109 KB    | ########## | 100% 
#20 6.548 
_openmp_mutex-5.1    | 21 KB     |            |   0% 
_openmp_mutex-5.1    | 21 KB     | ########## | 100% 
#20 6.574 
libgomp-11.2.0       | 474 KB    |            |   0% 
libgomp-11.2.0       | 474 KB    | ########## | 100% 
#20 6.608 
ld_impl_linux-64-2.3 | 654 KB    |            |   0% 
ld_impl_linux-64-2.3 | 654 KB    | ########## | 100% 
#20 6.635 
setuptools-69.5.1    | 1002 KB   |            |   0% 
setuptools-69.5.1    | 1002 KB   | ########## | 100% 
#20 6.699 
python-3.8.19        | 23.8 MB   |            |   0% 
python-3.8.19        | 23.8 MB   | ########## | 100% 
python-3.8.19        | 23.8 MB   | ########## | 100% 
#20 7.085 
tk-8.6.14            | 3.4 MB    |            |   0% 
tk-8.6.14            | 3.4 MB    | ########## | 100% 
#20 7.172 
openssl-3.0.13       | 5.2 MB    |            |   0% 
openssl-3.0.13       | 5.2 MB    | ########## | 100% 
#20 7.267 
sqlite-3.45.3        | 1.2 MB    |            |   0% 
sqlite-3.45.3        | 1.2 MB    | ########## | 100% 
#20 7.304 
libgcc-ng-11.2.0     | 5.3 MB    |            |   0% 
libgcc-ng-11.2.0     | 5.3 MB    | ########## | 100% 
libgcc-ng-11.2.0     | 5.3 MB    | ########## | 100% 
#20 7.424 
xz-5.4.6             | 643 KB    |            |   0% 
xz-5.4.6             | 643 KB    | ########## | 100% 
#20 7.463 
libstdcxx-ng-11.2.0  | 4.7 MB    |            |   0% 
libstdcxx-ng-11.2.0  | 4.7 MB    | ########## | 100% 
#20 7.553 
pip-24.0             | 2.6 MB    |            |   0% 
pip-24.0             | 2.6 MB    | ########## | 100% 
pip-24.0             | 2.6 MB    | ########## | 100% 
#20 7.675 Preparing transaction: ...working... done
#20 7.841 Verifying transaction: ...working... done
#20 8.559 Executing transaction: ...working... done
#20 8.934 Installing pip dependencies: ...working... Pip subprocess error:
#20 10.81   Running command git clone --filter=blob:none --quiet https://github.com/facebookresearch/detectron2.git /tmp/pip-req-build-uc1k_xq1
#20 10.81   error: subprocess-exited-with-error
#20 10.81   
#20 10.81   × python setup.py egg_info did not run successfully.
#20 10.81   │ exit code: 1
#20 10.81   ╰─> [6 lines of output]
#20 10.81       Traceback (most recent call last):
#20 10.81         File "<string>", line 2, in <module>
#20 10.81         File "<pip-setuptools-caller>", line 34, in <module>
#20 10.81         File "/tmp/pip-req-build-uc1k_xq1/setup.py", line 10, in <module>
#20 10.81           import torch
#20 10.81       ModuleNotFoundError: No module named 'torch'
#20 10.81       [end of output]
#20 10.81   
#20 10.81   note: This error originates from a subprocess, and is likely not a problem with pip.
#20 10.81 error: metadata-generation-failed
#20 10.81 
#20 10.81 × Encountered error while generating package metadata.
#20 10.81 ╰─> See above for output.
#20 10.81 
#20 10.81 note: This is an issue with the package mentioned above, not pip.
#20 10.81 hint: See above for details.
#20 10.81 
#20 10.81 Ran pip subprocess with arguments:
#20 10.81 ['/opt/conda/envs/mlflow-env/bin/python', '-m', 'pip', 'install', '-U', '-r', '/model/condaenv.detud3st.requirements.txt']
#20 10.81 Pip subprocess output:
#20 10.81 Collecting git+https://github.com/facebookresearch/detectron2.git (from -r /model/condaenv.detud3st.requirements.txt (line 5))
#20 10.81   Cloning https://github.com/facebookresearch/detectron2.git to /tmp/pip-req-build-uc1k_xq1
#20 10.81   Resolved https://github.com/facebookresearch/detectron2.git to commit 79f914785a87b80565381f4489b129e633c4efb5
#20 10.81   Preparing metadata (setup.py): started
#20 10.81   Preparing metadata (setup.py): finished with status 'error'
#20 10.81 
#20 10.81 failed
#20 10.81 
#20 10.81 CondaEnvException: Pip failed
#20 10.81 
#20 ERROR: process "/bin/sh -c echo $BUILD_LOG_START_DELIMITER && cat model/conda.yaml && conda env create -f model/conda.yaml -n mlflow-env && echo $BUILD_LOG_CONDA_END_DELIMITER && echo $BUILD_LOG_END_DELIMITER && conda clean -afy" did not complete successfully: exit code: 1
------

It looks like the detectron2 build is being triggered before torch is installed.
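
One way to see why the order of the list does not help: judging by the log, pip has to generate metadata for each requirement (the `Preparing metadata (setup.py)` step) before it installs any of them, and detectron2's `setup.py` runs `import torch` at that point. Packages listed directly under `dependencies`, on the other hand, are installed by conda before the pip stage starts. Below is a minimal sketch of a pre-flight check on the `conda_env` dict based on that reading. The helper names are hypothetical, and it is only a heuristic: it flags every `git+` requirement when torch is not coming from conda, without checking which of them actually import torch in their `setup.py`.

```python
import re


def conda_package_names(conda_env):
    """Names of the packages conda installs itself, before the pip stage."""
    names = set()
    for dep in conda_env.get("dependencies", []):
        if isinstance(dep, str):
            # Drop an optional "channel::" prefix and any version specifier,
            # e.g. "pytorch==2.2.2" -> "pytorch", "python=3.8" -> "python".
            name = dep.split("::")[-1]
            names.add(re.split(r"[=<>! ]", name, maxsplit=1)[0].lower())
    return names


def pip_requirements(conda_env):
    """Entries of the nested {"pip": [...]} section, or an empty list."""
    for dep in conda_env.get("dependencies", []):
        if isinstance(dep, dict):
            return list(dep.get("pip", []))
    return []


def vcs_requirements_built_without_torch(conda_env):
    """Flag git+ requirements that pip would build while torch is absent."""
    has_torch = bool({"pytorch", "torch"} & conda_package_names(conda_env))
    if has_torch:
        return []
    return [req for req in pip_requirements(conda_env) if req.startswith("git+")]


# The environment from this post: torch appears only in the pip section.
original_env = {
    "channels": ["defaults"],
    "dependencies": [
        "python=3.8",
        "pip",
        {
            "pip": [
                "torch",
                "mlflow",
                "torchvision",
                "numpy",
                "git+https://github.com/facebookresearch/detectron2.git",
            ]
        },
    ],
}

for req in vcs_requirements_built_without_torch(original_env):
    print("built before torch is available:", req)
```

If this prints anything, reordering the `pip` list is unlikely to change the outcome; torch would need to be present before the pip stage runs.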

I would like to get some support for my problem and would be happy to share more info.

1 REPLY

StephenDsouza
New Contributor II

Found an answer!

Basically, pip was somehow processing the dependency from the git repo first rather than following the given order. To solve this, I moved the libraries into the conda dependencies, so that conda installs them before pip runs.

```
conda_env = {
    "channels": [
        "defaults",
        "pytorch"
    ],
    "dependencies": [
        "python=3.8",
        "numpy==1.24.3",
        "pytorch==2.2.2",
        "pip",
        {
            "pip": [
                "fvcore==0.1.5.post20221221",
                "git+https://github.com/wookayin/gpustat",
                "pycocotools==2.0.6",
                "torchvision==0.15.2",
                "git+https://github.com/facebookresearch/detectron2.git"
            ]
        }
    ]
}
```
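
A possible refinement, not something tested in this thread: the bare `git+https://github.com/facebookresearch/detectron2.git` requirement installs whatever is on the default branch at build time, so a later rebuild of the endpoint could pick up a different detectron2. pip's VCS syntax accepts an `@<commit>` suffix, so the requirement can be pinned. The sketch below reuses the same environment and pins to the commit that the build log in the question resolved to; choosing that particular commit is an assumption, and `build_conda_env` is a hypothetical helper name.

```python
# Commit hash copied from the "Resolved ... to commit" line of the build log.
DETECTRON2_COMMIT = "79f914785a87b80565381f4489b129e633c4efb5"
DETECTRON2_REQ = (
    "git+https://github.com/facebookresearch/detectron2.git@" + DETECTRON2_COMMIT
)


def build_conda_env(detectron2_req=DETECTRON2_REQ):
    """Same environment as above, with detectron2 pinned to one commit."""
    return {
        "channels": ["defaults", "pytorch"],
        "dependencies": [
            "python=3.8",
            "numpy==1.24.3",
            # Installed by conda, so torch exists before pip builds detectron2.
            "pytorch==2.2.2",
            "pip",
            {
                "pip": [
                    "fvcore==0.1.5.post20221221",
                    "git+https://github.com/wookayin/gpustat",
                    "pycocotools==2.0.6",
                    "torchvision==0.15.2",
                    detectron2_req,
                ]
            },
        ],
    }


conda_env = build_conda_env()
```

The resulting dict can be passed as `conda_env=conda_env` to `mlflow.pyfunc.log_model`, in the same way as in the original post.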

 
