Hi @RodrigoE ,
Your LATERAL subquery invokes the Vector Search function once for every row of the 52M-row table, which means tens of millions of remote calls to the Vector Search endpoint. That is an anti-pattern: it is extremely slow, and the job runs long enough for the auth token to expire, which is the error you are seeing.
If your goal is "for each of the 1,400 query rows, find the best match among the 52M target rows", my suggestion is to invert the roles: create the vector index on the large table and drive the query from the small table. That cuts the number of remote calls from ~52M to ~1,400, and matches the intended "query multiple terms at the same time using LATERAL" pattern shown in the docs, where the outer (driver) table is small.
Ref Doc - https://docs.databricks.com/aws/en/sql/language-manual/functions/vector_search
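A minimal sketch of the inverted pattern, assuming the index has been built on the large table. The index name, table names, and column names below are placeholders; adjust them to your schema:

```sql
-- Assumptions: my_catalog.my_schema.big_table_index is a vector index built
-- on the 52M-row table; small_queries is the 1,400-row driver table with a
-- text column query_text. Only ~1,400 Vector Search calls are issued.
SELECT
  q.id AS query_id,
  m.matched_id,
  m.search_score
FROM small_queries AS q,
  LATERAL (
    SELECT id AS matched_id, search_score
    FROM VECTOR_SEARCH(
      index => 'my_catalog.my_schema.big_table_index',
      query_text => q.query_text,
      num_results => 1
    )
  ) AS m;
```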
But if you really do need what you are doing now, assigning each of the 52M rows its best match among the 1,400-row set, then Vector Search is probably not the right tool. Instead, compute embeddings for both tables and do a Spark-side nearest-prototype assignment: broadcast the 1,400 embeddings and compute the argmax similarity per row. That keeps all the compute in Spark and avoids the 52M remote calls. Alternatively, invert the problem: index the large table, issue 1,400 Vector Search queries, collect the top-1 matches, and post-process them if that satisfies your use case.
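The core of the nearest-prototype assignment is just a matrix multiply plus an argmax. Here is a toy NumPy sketch with made-up sizes (10 rows, 4 prototypes, dimension 8 instead of 52M, 1,400, and your real embedding dimension); in Spark you would wrap this in a pandas UDF with the 1,400 prototype embeddings in a broadcast variable:

```python
import numpy as np

# Toy stand-ins: 10 "large table" embeddings, 4 "prototype" embeddings, dim 8.
rng = np.random.default_rng(0)
rows = rng.normal(size=(10, 8))     # embeddings of the large table's rows
protos = rng.normal(size=(4, 8))    # embeddings of the 1,400-row set

# L2-normalize so the dot product equals cosine similarity.
rows_n = rows / np.linalg.norm(rows, axis=1, keepdims=True)
protos_n = protos / np.linalg.norm(protos, axis=1, keepdims=True)

# (10, 4) similarity matrix; argmax along axis 1 picks the best prototype
# for each row, all in one vectorized pass with no remote calls.
sims = rows_n @ protos_n.T
best = sims.argmax(axis=1)
```

Since the 1,400 prototype matrix is tiny, broadcasting it is cheap, and each Spark partition does only local arithmetic.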