databricks_sdk-0.50.0
The RAG chain uses databricks-gte-large-en as the embedding model and databricks-meta-llama-3-1-70b-instruct as the LLM generator.
I registered the RAG chain in Unity Catalog (UC) and deployed it with Model Serving. Once the RAG model is deployed and ready, the Chatbot UI can query its serving endpoint. Because the Chatbot App was created as a Databricks App, it uses a Service Principal by default to access the RAG model serving endpoint. I also implemented a Token Manager class that refreshes the access token before it expires.
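For context, here is a minimal sketch of the kind of Token Manager described above. It is an illustration, not my exact class: the names `TokenManager`, `fetch_token`, `refresh_margin_s`, and `clock` are hypothetical, and `fetch_token` stands in for whatever call actually obtains a token for the Service Principal and returns the token string plus its lifetime in seconds.

```python
import threading
import time
from typing import Callable, Optional, Tuple


class TokenManager:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(
        self,
        # Hypothetical callable: returns (token, lifetime_in_seconds).
        fetch_token: Callable[[], Tuple[str, float]],
        # Refresh this many seconds before the reported expiry.
        refresh_margin_s: float = 300.0,
        # Injectable clock so the refresh logic can be tested without waiting.
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._refresh_margin_s = refresh_margin_s
        self._clock = clock
        self._lock = threading.Lock()
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get_token(self) -> str:
        """Return a cached token, fetching a new one if it is close to expiry."""
        with self._lock:
            refresh_at = self._expires_at - self._refresh_margin_s
            if self._token is None or self._clock() >= refresh_at:
                token, expires_in_s = self._fetch_token()
                self._token = token
                self._expires_at = self._clock() + expires_in_s
            return self._token
```

The app calls `get_token()` before each request and puts the result in the `Authorization: Bearer ...` header. Note that a class like this only covers the token the app sends to the RAG endpoint. As far as I can tell from the traceback below, the failing call is made from inside the serving container to databricks-gte-large-en, which this class does not control.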
However, every time the RAG model endpoint starts up (either after a restart or after scaling from zero), it only works for one hour. After one hour, my Chatbot UI can no longer query the RAG model endpoint and receives this error message:
❌HTTP error occurred: 400 Client Error: Bad Request for url: https://dbc-xxxxx.cloud.databricks.com/serving-endpoints/chatbot_poc/invocations
🔢Status code: N/A 📄 Response text: No response text 📦 Parsed JSON (if available): {'error_code': 'BAD_REQUEST', 'message': '1 tasks failed. Errors: {0: 'error: Exception("Response content b\'Invalid Token\', status_code 400") Traceback (most recent call last):\n File "/opt/conda/envs/mlflow-env/lib/python3.12/site-packages/databricks/vector_search/utils.py", line 128, in issue_request\n response.raise_for_status()\n File "/opt/conda/envs/mlflow-env/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status\n raise HTTPError(http_error_msg, response=self)\nrequests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://dbcxxx.cloud.databricks.com/api/2.0/serving-endpoints/databricks-gte-large-en\\n\\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File "/opt/conda/envs/mlflow-env/lib/python3.12/site-packages/mlflow/langchain/api_request_parallel_processor.py",
If I open the URL shown in the "Bad Request for url" message, I get this error message:
{"error_code":401,"message":"Credential was not sent or was of an unsupported type for this API. [ReqId: xxxxx-xxxx]"}
I have tried everything I could find but still cannot diagnose the root cause. It seems to be related to some MLflow environment configuration for the databricks-gte-large-en model serving endpoint. However, that endpoint is not managed by me, and I could not find the source code of this API. I also tried using a PAT to query the RAG endpoint and still got the same error.
Could you help me diagnose this error? I need to deploy this Chatbot App to our internal team, and a RAG model serving endpoint that fails after one hour and has to be restarted manually is not a solution we can use.
Thanks!
Regards,
Jing Xie