Community Platform Discussions

Failure when deploying a custom LLM serving endpoint

ismaelhenzel
Contributor

I'm currently experimenting with vector search using Databricks. Everything runs smoothly when I load the model deployed in Unity Catalog into a notebook session and ask questions using Python. However, when I attempt to serve it, I encounter a generic error.

ismaelhenzel_0-1715091515103.png

The container builds successfully, but the endpoint fails when the code runs. Debugging the serving endpoint is challenging because the machine is not available for an interactive session. I've observed that the error occurs when the code attempts to retrieve the index or endpoint, as shown below:

  • vsc.get_endpoint(name=vector_search_endpoint_name) returns: An error occurred while loading the model. Expecting value: line 1 column 1 (char 0).
  • index = vsc.get_index(vector_search_endpoint_name, vs_index_fullname) returns: An error occurred while loading the model. Expecting value: line 1 column 1 (char 0).
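For readers hitting the same message: "Expecting value: line 1 column 1 (char 0)" is the text Python's standard json module produces when it is asked to parse a body that is not JSON at all, such as an empty string or an HTML page. The sketch below only demonstrates that behavior with the standard library. Whether the vector search client is receiving such a body inside the serving container is an assumption, not something confirmed in this thread, and parse_json_body is a hypothetical helper name.

```python
import json


def parse_json_body(body: str):
    """Try to parse a response body as JSON.

    Returns (value, None) on success or (None, error_message) on failure.
    """
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        return None, str(exc)


# Sample bodies (made up for illustration): an empty response, an HTML
# error page, and a valid JSON document.
samples = ["", "<html>Access denied</html>", '{"name": "my_endpoint"}']

for body in samples:
    value, error = parse_json_body(body)
    outcome = value if error is None else error
    print(f"{body[:25]!r} -> {outcome}")
```

Both non-JSON samples fail at the very first character, which is why the message gives no hint about what the server actually returned.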

** My workspace is fully private.

** I'm basing my code on this Databricks example: https://notebooks.databricks.com/demos/llm-rag-chatbot/index.html#

3 REPLIES

ismaelhenzel
Contributor

I'm using OpenAI text-embedding-3-large as the embedding model, DBRX as the chat model, and Databricks as the vector store. Everything is deployed and working fine in the workspace, but for some reason I get an error when trying to serve the model. My Unity Catalog is already on public storage for this PoC because model serving didn't support the firewall of a private storage.

srikanth009
New Contributor II

I am facing similar challenges debugging this serving endpoint because there is no interactive session, and I'm looking for alternative solutions. Everything ran perfectly when logging and loading the model in Databricks, but it shows errors after creating an endpoint and querying it.
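One possible workaround for the lack of an interactive session is to wrap each setup step so that a failure is logged with the step name and a full traceback before it is re-raised. This is a minimal sketch using only the standard library. run_step and the logger name are hypothetical, and it assumes that whatever the container writes to stderr is visible in the endpoint's logs.

```python
import logging
import sys

# Send log records to stderr; it is an assumption that the serving
# environment captures stderr in its logs.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger("serving_diagnostics")


def run_step(step_name, fn, *args, **kwargs):
    """Run fn(*args, **kwargs); on failure, log the traceback and re-raise
    with the step name in the message so a generic error becomes traceable."""
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        logger.exception("Step %r failed", step_name)
        raise RuntimeError(
            f"{step_name} failed: {type(exc).__name__}: {exc}"
        ) from exc
```

Inside the model's loading code, a call such as run_step("get_index", vsc.get_index, vector_search_endpoint_name, vs_index_fullname) would then report which call failed instead of only the generic loading error.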

brycejune
New Contributor III

Ensure your vector_search_endpoint_name and vs_index_fullname match the deployment setup. Check model deployment logs for detailed errors and confirm your workspace's network settings allow access to Unity Catalog and model serving endpoints in a private workspace.
Regards,

Bryce June
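The name check suggested above could be sketched as follows: read the settings from environment variables, fail fast with a readable message if any are missing or malformed, and only then build the client with explicit credentials. The environment variable names VS_ENDPOINT_NAME and VS_INDEX_FULLNAME and both helper functions are hypothetical. Passing workspace_url and personal_access_token to VectorSearchClient follows the pattern used in the linked demo, but treat it as an assumption to verify against your installed databricks-vectorsearch version.

```python
import os


def load_vector_search_config(env=os.environ):
    """Collect and validate the settings needed to reach a vector search index."""
    config = {
        "host": env.get("DATABRICKS_HOST", "").strip(),
        "token": env.get("DATABRICKS_TOKEN", "").strip(),
        "endpoint": env.get("VS_ENDPOINT_NAME", "").strip(),  # hypothetical name
        "index": env.get("VS_INDEX_FULLNAME", "").strip(),    # hypothetical name
    }
    problems = []
    if not config["host"].startswith("https://"):
        problems.append("DATABRICKS_HOST must be an https:// workspace URL")
    if not config["token"]:
        problems.append("DATABRICKS_TOKEN is empty")
    if not config["endpoint"]:
        problems.append("VS_ENDPOINT_NAME is empty")
    # Unity Catalog objects use a three-part name: catalog.schema.name
    parts = config["index"].split(".")
    if len(parts) != 3 or not all(parts):
        problems.append("VS_INDEX_FULLNAME must look like catalog.schema.index")
    if problems:
        raise ValueError("; ".join(problems))
    return config


def get_index(config):
    """Build the client with explicit credentials and fetch the index."""
    # Imported lazily so the validation above runs without the package installed.
    from databricks.vector_search.client import VectorSearchClient

    vsc = VectorSearchClient(
        workspace_url=config["host"],
        personal_access_token=config["token"],
    )
    return vsc.get_index(
        endpoint_name=config["endpoint"], index_name=config["index"]
    )
```

Calling load_vector_search_config() first means a missing variable in the serving environment shows up as an explicit message rather than as a failure deep inside the client.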
