topic Re: Dbdemo: LLM Chatbot With Retrieval Augmented Generation (RAG) in Generative AI

Dbdemo: LLM Chatbot With Retrieval Augmented Generation (RAG)

cmunteanu — Fri, 22 Mar 2024 11:35:03 GMT

Hello All,

I am trying to follow the dbdemo called 'llm-rag-chatbot' available at the following link. The setup works OK, and I have imported an embedding model from the Databricks Marketplace:

bge_large_en_v1_5

While running the notebook 01-Data-Preparation-and-Index, I am stuck on an error when trying to create a Vector Search Index with Managed Embeddings and the BGE model that I had previously set up as a serving endpoint. More specifically, the Vector Search endpoint provisions successfully, but when executing the index creation and synchronization method, create_delta_sync_index, I get the following error:

----

Exception: Response content b'{"error_code":"INVALID_PARAMETER_VALUE","message":"Model serving endpoint bge-large-en configured with improper input: {\\"error_code\\": \\"BAD_REQUEST\\", \\"message\\": \\"Failed to enforce schema of data \' 0\\\\n0 Welcome to databricks vector search\' with schema \'[\'input\': string (required)]\'. Error: Model is missing inputs [\'input\']. Note that there were extra inputs: [0]\\"}"}', status_code 400

----

My code that calls this method is:

if not index_exists(vsc, VECTOR_SEARCH_ENDPOINT_NAME, vs_index_fullname):
    print(f"Creating index {vs_index_fullname} on endpoint {VECTOR_SEARCH_ENDPOINT_NAME}...")
    vsc.create_delta_sync_index(
        endpoint_name=VECTOR_SEARCH_ENDPOINT_NAME,
        index_name=vs_index_fullname,
        source_table_name=source_table_fullname,
        pipeline_type="TRIGGERED",
        primary_key="id",
        embedding_source_column='content',  # The column containing our text
        embedding_model_endpoint_name='bge-large-en'
        # embedding_model_endpoint_name='gte_large'
    )

I have tried changing to a different embedding model (GTE_LARGE), but I still get the above error.

I guess there is an incompatibility between the input schema of the embedding model and the schema expected by the vector search endpoint.
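To make the mismatch in the error message concrete, here is a minimal sketch of the kind of column-name check the message describes. It only illustrates the "missing inputs / extra inputs" wording; check_inputs is a hypothetical helper and not the actual validation code used by the serving endpoint.

```python
# Illustrative only: mimics the column-name check described in the error
# message. This is not the serving endpoint's real implementation.

def check_inputs(required_columns, payload_columns):
    """Return (missing, extra) column names for a payload against a schema."""
    required = list(required_columns)
    received = list(payload_columns)
    missing = [c for c in required if c not in received]
    extra = [c for c in received if c not in required]
    return missing, extra

# Per the error, the model signature requires a string column named "input".
required = ["input"]

# The payload quoted in the error carries a single column named 0.
missing, extra = check_inputs(required, [0])
print(missing, extra)  # ['input'] [0]

# A payload whose column is named "input" would pass this check.
print(check_inputs(required, ["input"]))  # ([], [])
```

Read this way, the error suggests the text is sent to the endpoint under a column named 0, while the model signature only accepts a column named "input".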

Have any of you encountered this problem? I would appreciate a hint on how to solve it using an embedding model from the Databricks Marketplace.

Thanks!

Re: Dbdemo: LLM Chatbot With Retrieval Augmented Generation (RAG)

cmunteanu — Tue, 26 Mar 2024 13:33:22 GMT

Hello @Retired_mod, thanks a lot for the information you provided. In the meantime I have found a workaround: pre-computing the embeddings for each chunk. I created an embedding column on the source table and used this column as input to the create_delta_sync_index method.

That is, replace the parameter embedding_source_column='content' with:

embedding_dimension=1024,

embedding_vector_column="embedding"

and the synchronization of the index with the source table worked just fine.
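For reference, the workaround might look roughly like the sketch below. The helper self_managed_index_kwargs and the index and table names are placeholders, and dropping embedding_model_endpoint_name is an assumption (the post does not say whether it was kept). The dimension of 1024 is the value given above.

```python
# Sketch under assumptions: builds the arguments for an index over a
# pre-computed embedding column. Names below are placeholders.

def self_managed_index_kwargs(endpoint_name, index_name, source_table_name,
                              embedding_column="embedding", dimension=1024):
    """Keyword arguments for create_delta_sync_index with self-managed embeddings."""
    return {
        "endpoint_name": endpoint_name,
        "index_name": index_name,
        "source_table_name": source_table_name,
        "pipeline_type": "TRIGGERED",
        "primary_key": "id",
        # These two replace embedding_source_column (assumption: the
        # embedding_model_endpoint_name parameter is dropped as well).
        "embedding_dimension": dimension,
        "embedding_vector_column": embedding_column,
    }

kwargs = self_managed_index_kwargs(
    "dbdemos_vs_endpoint",          # placeholder endpoint name
    "catalog.schema.my_vs_index",   # placeholder index name
    "catalog.schema.my_table",      # placeholder source table
)

# On Databricks, with vsc = VectorSearchClient(), the call would then be:
# vsc.create_delta_sync_index(**kwargs)
```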

Re: Dbdemo: LLM Chatbot With Retrieval Augmented Generation (RAG)

jbellidocaceres — Fri, 24 May 2024 07:13:00 GMT

Hi @Retired_mod and @cmunteanu, I am having exactly the same problem creating the vector index, and it seems there could be a bug in the demo. What confuses me is that even when using the Databricks UI, I cannot manage to create the vector index.

When running the demo, it keeps repeating the following for a long time:

============

Waiting for index to be ready, this can take a few min... {'detailed_state': 'PROVISIONING_INITIAL_SNAPSHOT', 'message': 'Index is currently is in the process of syncing initial data. Check latest status: https://adb-393322312342211.5.azuredatabricks.net/explore/data/dev_talk/llm_rag/databricks_documentation_vs_index', 'indexed_row_count': 0, 'provisioning_status': {'initial_pipeline_sync_progress': {'latest_version_currently_processing': 1, 'num_synced_rows': 0, 'total_rows_to_sync': 14129, 'sync_progress_completion': 0.0, 'pipeline_metrics': {'total_sync_time_per_row_ms': 0.0, 'ingestion_metrics': {'ingestion_time_per_row_ms': 0.0, 'ingestion_batch_size': 300}, 'embedding_metrics': {'embedding_generation_time_per_row_ms': 0.0, 'embedding_generation_batch_size': 0}}}}, 'ready': False, 'index_url': 'adb-393322312342211.5.azuredatabricks.net/api/2.0/vector-search/endpoints/dbdemos_vs_endpoint/indexes/dev_talk.llm_rag.databricks_documentation_vs_index'} - pipeline url:adb-393322312342211.5.azuredatabricks.net/api/2.0/vector-search/endpoints/dbdemos_vs_endpoint/indexes/dev_talk.llm_rag.databricks_documentation_vs_index

Then, after a long time, the cell stops with the following error message:

"HTTPError: 400 Client Error: Bad Request for url: https://australiaeast.azuredatabricks.net/api/2.0/vector-search/endpoints/dev_talk_desk.llm_rag.databricks_documentation_vs_index/indexes/dbdemos_vs_endpoint"

It seems that the URL is wrong (this is the bug I was referring to): it has the endpoint name and the vector index name interchanged. It should be:

"https://australiaeast.azuredatabricks.net/api/2.0/vector-search/endpoints/dbdemos_vs_endpoint/indexes/dev_talk_desk.llm_rag.databricks_documentation_vs_index"

Just like in the cell output shown above, where the URL is shown correctly.

================
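As a small illustration of the expected layout, the sketch below builds the path in the order seen in the cell output (endpoints/<endpoint name>/indexes/<index name>). The function index_api_url is a hypothetical helper, not part of the demo or the client library.

```python
# Hypothetical helper: assembles the index URL in the order shown in the
# demo's status output, endpoint name first, index name second.

def index_api_url(host, endpoint_name, index_name):
    return (
        f"https://{host}/api/2.0/vector-search"
        f"/endpoints/{endpoint_name}/indexes/{index_name}"
    )

url = index_api_url(
    "australiaeast.azuredatabricks.net",
    "dbdemos_vs_endpoint",
    "dev_talk_desk.llm_rag.databricks_documentation_vs_index",
)
print(url)
```

Swapping the last two arguments reproduces the URL from the HTTPError, which is why I suspect the two names are passed in the wrong order somewhere.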

@Retired_mod If any specific configuration is required regarding the embedding model, it would be good to have it specified. In your reply you said:

When creating the Vector Search Index, ensure that you specify the correct parameters:

embedding_source_column: This should match the column name containing your text data (e.g., ‘content’).
embedding_model_endpoint_name: Use ‘bge-large-en’ as you’ve set up this model as a serving endpoint.

All these parameters are already configured correctly in the demo notebook, so I am confused about what is left for us to configure.

@cmunteanu I have followed your suggestion of using self-managed embeddings to create the vector index. It does work, in the sense that the index was created. However, I cannot easily use the nice feature of the Databricks vector_search client that converts text to vectors internally and vice versa, which makes things easier for the RAG chatbot. Have you found a way around that?
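One option I can think of, as an untested sketch: embed the question yourself and query the index with query_vector instead of query_text. Here search_with_text and embed are hypothetical names; embed stands for whatever function calls your embedding serving endpoint and returns a list of floats. The _FakeIndex class only exists so the sketch can be dry-run outside Databricks.

```python
# Sketch under assumptions: `index` is the object returned by
# vsc.get_index(...), and `embed` is your own function that turns a
# string into an embedding vector (e.g. by calling the serving endpoint).

def search_with_text(index, embed, question, columns, num_results=3):
    """Embed the question ourselves, then query the index by vector."""
    query_vector = embed(question)
    return index.similarity_search(
        query_vector=query_vector,
        columns=columns,
        num_results=num_results,
    )

class _FakeIndex:
    """Stand-in for a real index, used only for a local dry run."""
    def similarity_search(self, query_vector, columns, num_results):
        return {"dim": len(query_vector), "columns": columns, "k": num_results}

# Dry run with a dummy embedder that returns a 1024-dimensional vector.
result = search_with_text(
    _FakeIndex(),
    embed=lambda text: [0.0] * 1024,
    question="How do I create a vector search index?",
    columns=["id", "content"],
)
print(result)
```

This keeps the text-to-vector step in one place, so the rest of the chatbot code can still pass plain text.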