How to perform combined search on structured and unstructured data in databrick using RAG or other

Generative AI

Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

I created a RAG application in databricks which performs the following steps:

1. Extract text from PDF files

2. Prepare embeddings on extracted text and create vector search index

3. Create a LLM model and served the model which can answer question based on pdf data.

I followed the below URL mainly to develop the same:

https://notebooks.databricks.com/demos/llm-rag-chatbot/index.html#

Now I have some structured data in the delta tables as well. I want to perform the combined search on the pdf extracted data and the structured tables data.

I know that we can create vector search index on the structured table and use it for searching. But the problem with this approach is that I need to create a separate vector search for each table and also in vector search, the embeddings are created only on 1 column and that is used for searching via embeddings. How I can use all the columns from multiple structed tables and performed combined search via LLMs in databricks?

I also tried using Genie in databricks. I can perform search only on structured data and not on the text extracted form pdf files, so I could not use it.

I am open to use any other options also which can work in databricks.

0 REPLIES 0

Photos

Upload Upload
URL URL
Saved Photos Saved Photos

Upload location

Upload location

Add Photos to Album:

New Album

Drag here to start uploading

Drag photos here or

Tap for upload options

You must install or upgrade to the latest version of Adobe Flash Player before you can upload images.