Generate embeddings for 50 million rows in a DataFrame

vikram_p
Databricks Partner

Hello All,

I have a DataFrame with 5 million rows, and before we can set up a vector search endpoint against an index, we want to generate an embeddings column for each of those rows. Could you please suggest an optimal way to do this?

We are in the development phase, so for now we need to do a full load, but later we will need to do the same for incremental loads.

Thanks & Regards,

Vikram