Model serving endpoint creation failed
3 weeks ago - last edited 3 weeks ago
I have a logged MLflow pyfunc model that runs without issues in a Databricks notebook via
"mlflow.pyfunc.load_model()". I can also deploy it without issues as a model serving endpoint with "workload_type" set to GPU, but when I try to update the endpoint to CPU it fails with this repeating error:
"[pb897] [2025-01-29 14:50:40 +0000] [4014] [INFO] Booting worker with pid: 4014"
"[pb897] [2025-01-29 14:50:42 +0000] [9] [ERROR] Worker (pid:3932) was sent code 132!"
Why can the exact same configuration run on an environment with GPU but not on a CPU only environment?
I have also tried deleting the endpoint and re-creating it with the CPU config.
Labels:
- Model Serving
1 REPLY
2 weeks ago
The error you see when updating the endpoint to a CPU-only configuration usually comes down to a dependency or environment mismatch between the two configurations:
- Dependency mismatch: When the model was first deployed with GPU support, it may have pulled in GPU-specific dependencies (for example, CUDA-enabled library builds) that are missing or incompatible in the CPU-only environment. Libraries optimized for GPU usage are often packaged or configured differently for CPU.
- Incompatible Python or package versions: Differences in the Python version or in package versions (such as `cloudpickle` or `pandas`) between the environments used for logging, deploying, and serving the model can cause runtime errors. Keeping these versions consistent across all environments is critical.
- Model dependency configuration: If the model's dependencies were not explicitly specified or captured correctly when it was logged, the serving environment may be missing required packages. Make sure all required dependencies are included in the `requirements.txt` or `conda.yaml` logged with the model.
- Recreating the endpoint: Deleting and recreating the endpoint with a CPU configuration will not help if the underlying dependency or environment problem persists. Validate that the model and its dependencies are compatible with the CPU environment before redeploying.

To address these issues:
1. Validate Dependencies: Ensure that all required dependencies are explicitly specified and compatible with the CPU environment.
2. Environment Consistency: Verify that the Python and package versions match those used during model logging and registration.
3. Test Locally: Test the model in a local CPU environment to identify any dependency issues before deploying.
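One more hint: an exit status of 132 typically means the worker process died on SIGILL (illegal instruction, 128 + 4), which is a common symptom of a compiled wheel built for CPU features the serving host does not support, so the dependency checks above are the right place to start. For steps 1 and 2, you can compare the packages installed in a CPU environment against the pins in the model's logged `requirements.txt`. A minimal stdlib-only sketch (the package names and versions below are illustrative, not taken from your model):

```python
from importlib import metadata

def check_requirements(pinned: dict) -> list:
    """Return a message for each pinned package that is missing
    or installed at a different version than the pin."""
    problems = []
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (want {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{name}: installed {installed}, want {wanted}")
    return problems

# Pins as they might appear in the model's requirements.txt
# (illustrative versions).
pinned = {"cloudpickle": "2.2.1", "pandas": "2.1.4"}
for msg in check_requirements(pinned):
    print(msg)
```

Running this in a CPU-only notebook (or locally) before redeploying surfaces missing or mismatched packages without waiting for the endpoint build to fail.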

