Serving endpoints: model server failed to load the model: the file bash was not found: unknown

tessaickx
New Contributor III

While trying to create a serving endpoint with my custom model, I get a "Failed" state:

Model server failed to load the model. Please see service logs for more information.

The service logs show the following:

Container failed with: failed to create containerd task: failed to create shim task: the file bash was not found: unknown

It is unclear to me what could be wrong. The model is a custom PythonModel that uses a ConversationalRetrievalChain from langchain and a faiss vector store (the path on dbfs is given in artifacts).
 
Does anyone have an idea about what could be causing this error? 
 

tessaickx
New Contributor III

Update: 

After minor changes in the code, my endpoint failed again but with a different error:

[Screenshot of the new error: tessaickx_0-1694165345862.png]

Does anyone have any ideas?

ravi-malipeddi
New Contributor II

I have faced a similar issue and still haven't found the right solution. In my case, the error trace below is what I found in the service logs. I'm not sure where the issue could be.

"An error occurred while loading the model. You haven't configured the CLI yet! Please configure by entering `/opt/conda/envs/mlflow-env/bin/gunicorn configure`."

These are the things I have tried after doing some research:

  • Made sure the cluster uses an ML-compatible DBR; I used 13.3.
  • Added all the required libraries directly in the cluster configuration.
  • Logged the pip dependencies with the langchain model.
  • Tried to manually point to the mlflow env in the cluster configuration, as some blogs suggest, but the UI does not show such an option.
  • Tried different compute combinations in the endpoint configuration.
  • Configured the Databricks CLI inside the cluster and made sure it is set up properly.
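
For the "pip dependencies logged with the model" step, one way to keep the serving environment close to the notebook environment is to pin each dependency to the version installed where the model was built. The sketch below is only an illustration of that idea, not a confirmed fix for either error in this thread. The helper name `pinned_requirements` and the package names are assumptions; the resulting list could be passed as the `pip_requirements` argument of `mlflow.pyfunc.log_model`.

```python
from importlib import metadata


def pinned_requirements(packages):
    """Return pip requirement strings pinned to the locally installed versions."""
    pinned = []
    for name in packages:
        try:
            pinned.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Not installed in this environment: leave it unpinned so the gap is visible.
            pinned.append(name)
    return pinned


if __name__ == "__main__":
    # Package names are assumptions based on the libraries mentioned in this thread.
    print(pinned_requirements(["mlflow", "langchain", "faiss-cpu"]))
```

Printing the list before logging the model makes it easy to spot a package that came back unpinned, which means it is not installed in the environment where the model was built.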