Databricks Community

cmunteanu · ‎03-22-2024

Hello All,

I am trying to follow the dbdemo called 'llm-rag-chatbot' available at the following link. The setup works Ok, and I have imported from the Databricks Marketplace an embedding model that is:

bge_large_en_v1_5

Running the notebook called: 01-Data-Preparation-and-Index I am stuck with an error when trying to create a Vector Search Index with Managed Embeddings and the BGE model that I have setup as a serving endpoint, previously. More specifically, the Vector Search endpoint provisions succesfully, but when executing the index creation and syncronization method: create_delta_sync_index, I get the following error:

----

Exception: Response content b'{"error_code":"INVALID_PARAMETER_VALUE","message":"Model serving endpoint bge-large-en configured with improper input: {\\"error_code\\": \\"BAD_REQUEST\\", \\"message\\": \\"Failed to enforce schema of data \' 0\\\\n0 Welcome to databricks vector search\' with schema \'[\'input\': string (required)]\'. Error: Model is missing inputs [\'input\']. Note that there were extra inputs: [0]\\"}"}', status_code 400

----

My code that calls this method is:

if not index_exists(vsc, VECTOR_SEARCH_ENDPOINT_NAME, vs_index_fullname):

print(f"Creating index {vs_index_fullname} on endpoint {VECTOR_SEARCH_ENDPOINT_NAME}...")

vsc.create_delta_sync_index(

endpoint_name=VECTOR_SEARCH_ENDPOINT_NAME,

index_name=vs_index_fullname,

source_table_name=source_table_fullname,

pipeline_type="TRIGGERED",

primary_key="id",

embedding_source_column='content', #The column containing our text

embedding_model_endpoint_name='bge-large-en'

#embedding_model_endpoint_name='gte_large'

)

I have tried changing to a different embedding model (GTE_LARGE), but still getting the above error.

I guess there is a incompatibilty between the input schema of the embedding model and the schema expected by the vector search endpoint.

Has any of you encountered this problem? I would appreciate if you could give me a hint on how to solve it using an embedded model from Databricks Marketplace.

Thanks !

cmunteanu · ‎03-26-2024

Hello @Retired_mod , thanks a lot for the information you provided. Anyhow, I have managed a workaround, by pre-computing the embeddings for each chunk. I have created an embedding column on the source table and used this column as input to the create_delta_sync_index method.

That is: substitute parameter embedding_source_column='content' for:

embedding_dimension=1024,

embedding_vector_column="embedding"

and the syncronization of the index with the source table worked just fine.

View solution in original post

cmunteanu · ‎03-26-2024

Hello @Retired_mod , thanks a lot for the information you provided. Anyhow, I have managed a workaround, by pre-computing the embeddings for each chunk. I have created an embedding column on the source table and used this column as input to the create_delta_sync_index method.

That is: substitute parameter embedding_source_column='content' for:

embedding_dimension=1024,

embedding_vector_column="embedding"

and the syncronization of the index with the source table worked just fine.

jbellidocaceres · ‎05-24-2024

Hi @Retired_mod and @cmunteanu , I am having exactly the same problem to create the vector index and it seems that there could be a bug in the demo. What confuses me is that and even when using the Databricks UI, I can not manage to create the vector index.

Well, when running the demo, it stays for a long time repeating:

============

Waiting for index to be ready, this can take a few min... {'detailed_state': 'PROVISIONING_INITIAL_SNAPSHOT', 'message': 'Index is currently is in the process of syncing initial data. Check latest status: https://adb-393322312342211.5.azuredatabricks.net/explore/data/dev_talk/llm_rag/databricks_documenta...', 'indexed_row_count': 0, 'provisioning_status': {'initial_pipeline_sync_progress': {'latest_version_currently_processing': 1, 'num_synced_rows': 0, 'total_rows_to_sync': 14129, 'sync_progress_completion': 0.0, 'pipeline_metrics': {'total_sync_time_per_row_ms': 0.0, 'ingestion_metrics': {'ingestion_time_per_row_ms': 0.0, 'ingestion_batch_size': 300}, 'embedding_metrics': {'embedding_generation_time_per_row_ms': 0.0, 'embedding_generation_batch_size': 0}}}}, 'ready': False, 'index_url': 'adb-393322312342211.5.azuredatabricks.net/api/2.0/vector-search/endpoints/dbdemos_vs_endpoint/indexes/dev_talk.llm_rag.databricks_documentation_vs_index'} - pipeline url:adb-393322312342211.5.azuredatabricks.net/api/2.0/vector-search/endpoints/dbdemos_vs_endpoint/indexes/dev_talk.llm_rag.databricks_documentation_vs_index

Then after a long time the Cell stops with the following error message:

"HTTPError: 400 Client Error: Bad Request for url: https://australiaeast.azuredatabricks.net/api/2.0/vector-search/endpoints/dev_talk_desk.llm_rag.data..."

It seems that the url is wrong (this is the bug I was referring), it has the endpoint and the vector index path interchanged. It should be:

"https://australiaeast.azuredatabricks.net/api/2.0/vector-search/endpoints/dbdemos_vs_endpoint /indexes/dev_talk_desk.llm_rag.databricks_documentation_vs_index

dbdemos_vs_endpoint"

Just like in the output of the Cell that is showing above. There, the URL is showed correctly,

================

@Retired_mod If any specific configuration is required regarding the embedding model, it would be good to have it specified. In your reply you said:

When creating the Vector Search Index, ensure that you specify the correct parameters:

embedding_source_column: This should match the column name containing your text data (e.g., ‘content’).
embedding_model_endpoint_name: Use ‘bge-large-en’ as you’ve set up this model as a serving endpoint.

All these specifications are correctly configured in the demo notebook. So, I am confused on what is left for us to configure.

@cmunteanu I have followed your suggestion of using a self managed embedding to create the vector index. It does work, in the sense that I created the vector index. But, I can not use (easily) the nice features of Databricks vector_search client that converts internally text to vectors and vice-versa. Which make things easier for the RAG - chatbot. Have you got around that?

Databricks Community

Dbdemo: LLM Chatbot With Retrieval Augmented Generation (RAG)

Join Us as a Local Community Builder!

🚀 Announcing the Databricks Data Intelligence Platform Cheat Sheet

Find Sensitive Data at Scale with Data Classification in Unity Catalog

Solution Accelerator Series | #6 - Adverse Drug Event Detection

Announcing Backfill Runs in Lakeflow Jobs for Higher Quality Downstream Data

🚀 New: Databricks Interactive Architecture Design Workshops