
Dbdemo: LLM Chatbot With Retrieval Augmented Generation (RAG)

cmunteanu
Contributor

Hello All,

I am trying to follow the dbdemo called 'llm-rag-chatbot' available at the following link. The setup works OK, and I have imported the following embedding model from the Databricks Marketplace:

  • bge_large_en_v1_5

Running the notebook 01-Data-Preparation-and-Index, I am stuck with an error when trying to create a Vector Search index with managed embeddings, using the BGE model that I previously set up as a serving endpoint. More specifically, the Vector Search endpoint provisions successfully, but when I execute the index creation and synchronization method, create_delta_sync_index, I get the following error:

----
Exception: Response content b'{"error_code":"INVALID_PARAMETER_VALUE","message":"Model serving endpoint bge-large-en configured with improper input: {\\"error_code\\": \\"BAD_REQUEST\\", \\"message\\": \\"Failed to enforce schema of data \' 0\\\\n0 Welcome to databricks vector search\' with schema \'[\'input\': string (required)]\'. Error: Model is missing inputs [\'input\']. Note that there were extra inputs: [0]\\"}"}', status_code 400
----
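Reading the message: the serving endpoint's signature requires a string column named input, but the request arrived as a one-column table whose column is named 0. A rough sketch of that kind of column-name check in plain Python follows. This is only an illustration of what the error is reporting, not the actual MLflow or Databricks code, and check_columns is a hypothetical helper.

```python
def check_columns(required, provided):
    """Return (missing, extra) column names, mimicking a schema name check."""
    missing = [name for name in required if name not in provided]
    extra = [name for name in provided if name not in required]
    return missing, extra


# The model signature asks for a column called "input" ...
required_columns = ["input"]
# ... but the request carried a single unnamed column, which shows up as 0.
provided_columns = [0]

missing, extra = check_columns(required_columns, provided_columns)
print(f"Model is missing inputs {missing}. Extra inputs: {extra}")
```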
 
My code that calls this method is:
if not index_exists(vsc, VECTOR_SEARCH_ENDPOINT_NAME, vs_index_fullname):
  print(f"Creating index {vs_index_fullname} on endpoint {VECTOR_SEARCH_ENDPOINT_NAME}...")
  vsc.create_delta_sync_index(
    endpoint_name=VECTOR_SEARCH_ENDPOINT_NAME,
    index_name=vs_index_fullname,
    source_table_name=source_table_fullname,
    pipeline_type="TRIGGERED",
    primary_key="id",
    embedding_source_column='content', #The column containing our text
    embedding_model_endpoint_name='bge-large-en'
    #embedding_model_endpoint_name='gte_large' 
  )
I have tried switching to a different embedding model (GTE_LARGE), but I still get the same error.
I guess there is an incompatibility between the input schema of the embedding model and the schema that Vector Search sends to the model serving endpoint.
 
Has any of you encountered this problem? I would appreciate a hint on how to solve it using an embedding model from the Databricks Marketplace.
 
Thanks ! 
 

 

1 ACCEPTED SOLUTION

Accepted Solutions

cmunteanu
Contributor

Hello @Retired_mod, thanks a lot for the information you provided. In the meantime, I have found a workaround: pre-computing the embeddings for each chunk. I created an embedding column on the source table and used this column as input to the create_delta_sync_index method.

That is, replace the parameter embedding_source_column='content' with:
embedding_dimension=1024,
embedding_vector_column="embedding"
and the synchronization of the index with the source table worked just fine.
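For anyone following along, here is a minimal sketch of what the adjusted call might look like with self-managed embeddings. The helper self_managed_index_kwargs and the placeholder names are hypothetical, and leaving out embedding_model_endpoint_name is an assumption on my side (the vectors are already in the table, so no model endpoint should be needed). It also assumes the source table already has an "embedding" column holding 1024-dimensional vectors.

```python
def self_managed_index_kwargs(endpoint_name, index_name, source_table_name,
                              primary_key="id", embedding_dimension=1024,
                              embedding_vector_column="embedding"):
    """Build the arguments for create_delta_sync_index with precomputed vectors."""
    return {
        "endpoint_name": endpoint_name,
        "index_name": index_name,
        "source_table_name": source_table_name,
        "pipeline_type": "TRIGGERED",
        "primary_key": primary_key,
        # These two take the place of embedding_source_column.
        # Assumption: embedding_model_endpoint_name is dropped as well.
        "embedding_dimension": embedding_dimension,
        "embedding_vector_column": embedding_vector_column,
    }


kwargs = self_managed_index_kwargs(
    endpoint_name="my_vs_endpoint",               # placeholder name
    index_name="catalog.schema.my_index",         # placeholder name
    source_table_name="catalog.schema.my_table",  # placeholder name
)

# In the notebook, with vsc = VectorSearchClient() as in the demo:
# vsc.create_delta_sync_index(**kwargs)
```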
 


2 REPLIES 2


jbellidocaceres
New Contributor II
Hi @Retired_mod and @cmunteanu, I am having exactly the same problem creating the vector index, and it seems there could be a bug in the demo. What confuses me is that even when using the Databricks UI, I cannot manage to create the vector index.
 
When running the demo, it keeps repeating the following for a long time:
============
Waiting for index to be ready, this can take a few min... {'detailed_state': 'PROVISIONING_INITIAL_SNAPSHOT', 'message': 'Index is currently is in the process of syncing initial data. Check latest status: https://adb-393322312342211.5.azuredatabricks.net/explore/data/dev_talk/llm_rag/databricks_documenta...', 'indexed_row_count': 0, 'provisioning_status': {'initial_pipeline_sync_progress': {'latest_version_currently_processing': 1, 'num_synced_rows': 0, 'total_rows_to_sync': 14129, 'sync_progress_completion': 0.0, 'pipeline_metrics': {'total_sync_time_per_row_ms': 0.0, 'ingestion_metrics': {'ingestion_time_per_row_ms': 0.0, 'ingestion_batch_size': 300}, 'embedding_metrics': {'embedding_generation_time_per_row_ms': 0.0, 'embedding_generation_batch_size': 0}}}}, 'ready': False, 'index_url': 'adb-393322312342211.5.azuredatabricks.net/api/2.0/vector-search/endpoints/dbdemos_vs_endpoint/indexes/dev_talk.llm_rag.databricks_documentation_vs_index'} - pipeline url:adb-393322312342211.5.azuredatabricks.net/api/2.0/vector-search/endpoints/dbdemos_vs_endpoint/indexes/dev_talk.llm_rag.databricks_documentation_vs_index

Then, after a long time, the cell stops with the following error message:
 
 
 
 
It seems that the URL is wrong (this is the bug I was referring to): it has the endpoint and the vector index path interchanged. It should be:
 
 
Just like in the cell output shown above, where the URL is displayed correctly.
================
@Retired_mod If any specific configuration is required for the embedding model, it would be good to have it specified. In your reply you said:
  • When creating the Vector Search Index, ensure that you specify the correct parameters:
    • embedding_source_column: This should match the column name containing your text data (e.g., ‘content’).
    • embedding_model_endpoint_name: Use ‘bge-large-en’ as you’ve set up this model as a serving endpoint.
All of these parameters are already configured correctly in the demo notebook, so I am confused about what is left for us to configure.
@cmunteanu I followed your suggestion of using self-managed embeddings to create the vector index. It works, in the sense that the vector index was created. But I cannot (easily) use the nice feature of the Databricks vector_search client that converts text to vectors internally and vice versa, which makes things easier for the RAG chatbot. Have you found a way around that?
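One possible direction, sketched under assumptions: with self-managed embeddings the index cannot embed the question for you, so the query would be embedded in your own code and passed to similarity_search as query_vector instead of query_text. Here search_with_own_embeddings and embed_fn are hypothetical names. embed_fn stands for whatever function calls your embedding endpoint and returns a list of floats, and it has to use the same model that produced the "embedding" column.

```python
def search_with_own_embeddings(index, embed_fn, question, columns, k=3):
    """Embed the question ourselves, then query the index by vector."""
    query_vector = embed_fn(question)
    return index.similarity_search(
        query_vector=query_vector,  # used in place of query_text
        columns=columns,
        num_results=k,
    )


# In the notebook this might be used roughly as follows (names are placeholders):
# index = vsc.get_index(VECTOR_SEARCH_ENDPOINT_NAME, vs_index_fullname)
# results = search_with_own_embeddings(index, my_embed_fn,
#                                      "How can I track billing usage?",
#                                      columns=["id", "content"])
```

Wrapping the two steps in one helper keeps the rest of the chatbot code close to the managed-embeddings version, since callers still pass plain text.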
