Hi @RodrigoE ,
Your LATERAL subquery invokes the Vector Search function once for every row of the 52M-row table, which means tens of millions of remote calls to the Vector Search endpoint. That is an anti-pattern: it is extremely slow, and the job runs long enough for the auth token to expire, which is the error you are seeing.
If your goal is "for each of the 1,400 query rows, find the best match among the 52M target rows", my suggestion is to invert the roles: create the vector index on the large table and drive the query from the small table. That cuts the number of remote calls from ~52M to ~1,400, and matches the intended "query multiple terms at the same time using LATERAL" pattern shown in the docs, where the outer (driver) table is small.
Ref Doc - https://docs.databricks.com/aws/en/sql/language-manual/functions/vector_search
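A minimal sketch of the inverted pattern, assuming the index has been built on the large table. The index name, table names, and column names below are placeholders; adjust them to your schema:

```sql
-- Assumptions: my_catalog.my_schema.big_table_index is a vector index built
-- on the 52M-row table; small_queries is the 1,400-row driver table with a
-- text column query_text. Only ~1,400 Vector Search calls are issued.
SELECT
  q.id AS query_id,
  m.matched_id,
  m.search_score
FROM small_queries AS q,
  LATERAL (
    SELECT id AS matched_id, search_score
    FROM VECTOR_SEARCH(
      index => 'my_catalog.my_schema.big_table_index',
      query_text => q.query_text,
      num_results => 1
    )
  ) AS m;
```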
But if you really do need what you are doing now, assigning each of the 52M rows its best match among the 1,400-row set, then Vector Search is probably not the right tool. Instead, compute embeddings for both tables and do a Spark-side nearest-prototype assignment: broadcast the 1,400 embeddings and compute the argmax similarity per row. That keeps all the compute in Spark and avoids the 52M remote calls. Alternatively, invert the problem: index the large table, issue 1,400 Vector Search queries, collect the top-1 matches, and post-process them if that satisfies your use case.
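The core of the nearest-prototype assignment is just a matrix multiply plus an argmax. Here is a toy NumPy sketch with made-up sizes (10 rows, 4 prototypes, dimension 8 instead of 52M, 1,400, and your real embedding dimension); in Spark you would wrap this in a pandas UDF with the 1,400 prototype embeddings in a broadcast variable:

```python
import numpy as np

# Toy stand-ins: 10 "large table" embeddings, 4 "prototype" embeddings, dim 8.
rng = np.random.default_rng(0)
rows = rng.normal(size=(10, 8))     # embeddings of the large table's rows
protos = rng.normal(size=(4, 8))    # embeddings of the 1,400-row set

# L2-normalize so the dot product equals cosine similarity.
rows_n = rows / np.linalg.norm(rows, axis=1, keepdims=True)
protos_n = protos / np.linalg.norm(protos, axis=1, keepdims=True)

# (10, 4) similarity matrix; argmax along axis 1 picks the best prototype
# for each row, all in one vectorized pass with no remote calls.
sims = rows_n @ protos_n.T
best = sims.argmax(axis=1)
```

Since the 1,400 prototype matrix is tiny, broadcasting it is cheap, and each Spark partition does only local arithmetic.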