topic Vector search index creation is incredibly slow in Generative AI

Vector search index creation is incredibly slow

epistoteles — Wed, 05 Mar 2025 15:15:09 GMT

I am trying to create a vector search index for a Delta Table using Azure OpenAI embeddings (text-embedding-3-large). The table contains 5000 chunks with approx. 1000 tokens each. The OpenAI embeddings are generated through a Databricks model serving endpoint which forwards the embedding requests to our Azure deployment.

The latency of the index creation is incredibly high. To embed just 5000 chunks the initial sync takes over an hour. If I extrapolate this to 5 Mio. chunks, embedding them would take over a month.

The deployment in Azure can process 350K tokens/min and should not be the limiting factor.

I am currently creating the index like this:

vsc.create_delta_sync_index(
    endpoint_name=VECTOR_SEARCH_ENDPOINT_NAME,
    index_name=vs_index_fullname,
    source_table_name=source_table_fullname,
    pipeline_type="TRIGGERED",
    primary_key="id",
    embedding_source_column='content'
    embedding_model_endpoint_name='text-embedding-3-large'
  )

My assumption for the slow speed is that the index calculates the embeddings row by row, without async Azure API calls, and only starts with the next row once the previous embedding is created.

Which options do I have for speeding up the index creation?

Re: Vector search index creation is incredibly slow

amitkumarvish — Tue, 10 Jun 2025 04:20:59 GMT

@epistoteles I am also facing the same issue while delta table sync with Index. Could you please share if you have any work around or used some different appoarch to decrease latency.

Thanks in advance.

Re: Vector search index creation is incredibly slow

epistoteles — Tue, 10 Jun 2025 07:45:57 GMT

@amitkumarvish My only solution so far has been to increase embedding model capacity on Azure ... still comparably slow.

Re: Vector search index creation is incredibly slow

AbhaySingh — Wed, 19 Nov 2025 12:57:42 GMT

Is following an option for you?

https://www.databricks.com/blog/announcing-storage-optimized-endpoints-vector-search

Also, good to have your SA address this further as underlying issue may be somewhere else - model serving performance for instance.