I was wondering whether there is a roadmap for the development of the vector_search function (see "vector_search function | Databricks Documentation").
Specifically, I was wondering if or when the following limitations may be lifted:
- Querying DIRECT_ACCESS index types is not supported.
- The input parameter filters_json is not supported.
- Hybrid keyword-similarity search is not supported using vector_search().
We are currently developing a solution that requires a direct access vector store and batch similarity searches. We need filters and would prefer hybrid search, though hybrid search is more of a nice-to-have for now.
Currently our solution uses an async function that calls the Python client for each query, but this is slow:
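from databricks.vector_search.client import VectorSearchClient

# Endpoint, index, and column names below are placeholders.
client = VectorSearchClient()
index = client.get_index(endpoint_name="<endpoint_name>", index_name="<index_name>")

results = index.similarity_search(
    columns=["<column_name>"],  # columns to return (placeholder)
    query_text=q,
    num_results=k,
    query_vector=embedded_query,
    filters={"document_id": document_id_filter},
    query_type="hybrid",
)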
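For context, the async batching around that call looks roughly like the sketch below. This is a simplified illustration and not our exact code: `batch_search`, `max_concurrency`, and `fake_search` are hypothetical names, and `fake_search` stands in for the real `index.similarity_search` call so the snippet runs without a workspace.

```python
import asyncio
from typing import Any, Callable, Dict, List


async def batch_search(
    search_one: Callable[[Dict[str, Any]], Any],
    requests: List[Dict[str, Any]],
    max_concurrency: int = 8,
) -> List[Any]:
    """Run blocking searches concurrently; results keep the order of `requests`."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(request: Dict[str, Any]) -> Any:
        async with semaphore:
            # to_thread keeps the blocking client call off the event loop.
            return await asyncio.to_thread(search_one, request)

    return await asyncio.gather(*(run(r) for r in requests))


def fake_search(request: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for index.similarity_search(**request).
    return {"query_text": request["query_text"], "hits": request["num_results"]}


requests = [{"query_text": f"q{i}", "num_results": 3} for i in range(5)]
results = asyncio.run(batch_search(fake_search, requests))
```

Even with the concurrency cap raised, each query is still a separate round trip to the endpoint, which is why a set-based SQL function would be attractive for us.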
It would be great to know when we may be able to switch to the Databricks SQL vector_search function. We have noticed significant performance increases from switching to ai_query, and it would also be nice to keep everything in Spark DataFrame format throughout the process.
Thanks!