cancel
Showing results for 
Search instead for 
Did you mean: 
Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.
cancel
Showing results for 
Search instead for 
Did you mean: 

Help with hybrid search function

the_peterlandis
New Contributor II

I need help writing a filter.  I want to pre-filter a vector index before performing a hybrid search and create this as a function.  Below is a simple example of searching for products for a given customer.  A prefilter is key as this provides authorizations for searching a vector index before applying the top k which reduces the vector space searching as a prefilter before the search.  I am not seeing any filter capability like how you would call the API.
Example of the search API with prefilter

results = index.similarity_search(
   query_text=question,
   query_type = "HYBRID",
   columns=["content", "product", "product_description", "product_id", "purchase_date"],
   filters="{"customer_emai":customer_email},
   num_results=5
)

Below is the SQL Function I need help on

%sql
CREATE OR REPLACE FUNCTION get_customer_products(
search_query string comment 'Search query',
customer_email string comment 'Customer email',
)
RETURNS TABLE (
product STRING,
product_description STRING,
product_id int,
purchase_date_timestamp timestamp
)
COMMENT 'Retrieve the customer products'
LANGUAGE SQL
RETURN (
SELECT
content,
product_id,
purchase_date_timestamp
from
vector_search(
index=> 'gold.my_store.products',
query => search_query,
query_type => 'HYBRID',
filter => '{"customer_email": "' || customer_email || '"}',
num_results => 5
)
)
2 REPLIES 2

the_peterlandis
New Contributor II

Based on this documentation, it says, it indicate sql function VECTOR_SEARCH cannot apply pre filter which prefilter is a fundamental capability for vector search.  Just very surprised this is not supported.

KaushalVachhani
Databricks Employee
Databricks Employee

@the_peterlandis, Yes, currently vector_search SQL function doesn't provide pre filter support. However, if you must implement the UC function for this, you can do it something like below using Python code with filters.

 

%sql
CREATE OR REPLACE FUNCTION kaushal.kaushal.vector_similarity_search(
  query_text STRING,
  filter_id INT,
  num_results INT
)
RETURNS STRING
LANGUAGE PYTHON
COMMENT "Vector similarity search using authenticated client"
ENVIRONMENT (
  dependencies = '["databricks-vectorsearch", "databricks-sdk"]',
  environment_version = 'None'
)
AS $$
import json
import os

# Get credentials from Databricks secrets
# You'll need to set these up first
def get_secret(scope, key):
    from databricks.sdk import WorkspaceClient
    w = WorkspaceClient()
    return w.secrets.get_secret(scope=scope, key=key).value

# Alternative: if dbutils is available in UDF context
# token = dbutils.secrets.get(scope="your-scope", key="databricks-token")
# host = dbutils.secrets.get(scope="your-scope", key="databricks-host")

# Set environment variables for authentication
os.environ['DATABRICKS_HOST'] = get_secret("your-scope", "databricks-host")
os.environ['DATABRICKS_TOKEN'] = get_secret("your-scope", "databricks-token")

from databricks.vector_search.client import VectorSearchClient

# Initialize client - should now pick up environment variables
client = VectorSearchClient()
index = client.get_index(
    endpoint_name="vector-search-demo-endpoint-kaushal",
    index_name="kaushal.kaushal.my_text_data_index"
)

results = index.similarity_search(
    query_text=query_text,
    columns=["id", "content"],
    filters={"id": [filter_id]},
    num_results=num_results
)

return json.dumps(results.get('result', {}).get('data_array', []))
$$;


Then run your UC function with SQL, and you should get the expected results. 

 

KaushalVachhani_0-1763803856179.png

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now