Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

RBAC and VectorSearch

re
New Contributor II

When implementing the managed VectorSearch, what is the preferred way to implement row-based access control? I see that you can use the filter API during a query, so simple filters on a certain column may work, but what if all the security information is in another table?

The use case in question is a RAG workflow where some information should be limited based on the querying user. The filter API probably works fine for a simple "information_deprecated" flag, but probably not for checking group membership.

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @re

  • As you mentioned, the filter API allows you to apply simple filters during a query. This approach works well for scenarios where you want to restrict access based on specific column values (e.g., an "information_deprecated" flag).
  • For instance, you can filter out documents where the "information_deprecated" flag is set to true.
  • However, this approach might not be suitable for more complex access control requirements, such as checking group membership.
  • If your security information (e.g., group membership) resides in a separate table, you'll need a more sophisticated approach.
  • Consider creating a mapping between users and their associated groups. This mapping could be stored in a separate table or a graph structure.
  • When querying VectorSearch, use this mapping to determine which groups the querying user belongs to.
  • Then, apply appropriate filters based on group membership. For example:
    • If a user belongs to Group A, allow access to documents associated with Group A.
    • If a user belongs only to Group B, limit results to documents associated with Group B.
  • RBAC is a powerful mechanism for managing access control. It allows you to define roles and assign permissions to those roles.
  • Create roles that correspond to different levels of access (e.g., read-only, read-write, admin).
  • Assign users to specific roles based on their group memberships or other criteria.
  • During VectorSearch queries, apply filters based on the user's role. For example:
    • If a user has the "read-only" role, limit access to read-only documents.
    • If a user has the "admin" role, allow access to all documents.
  • Depending on your use case, consider fine-grained access control at the document level.
  • Attach metadata to each document indicating which groups or roles are allowed to access it.
  • During queries, use this metadata to filter out documents that the querying user is not authorized to see.
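
The mapping-plus-metadata idea in the bullets above could be sketched roughly as follows. This is only an illustration, not a confirmed VectorSearch feature: the `USER_GROUPS` mapping, the `allowed_group` column, and all function names are hypothetical, and the assumption that a list value in the filter means "column is one of these values" should be checked against the filter API documentation. It also assumes each indexed row carries exactly one group in a scalar column.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical user -> groups mapping. In practice this would be loaded from
# the separate security table mentioned in the question.
USER_GROUPS: Dict[str, Set[str]] = {
    "alice": {"group_a", "group_b"},
    "bob": {"group_b"},
}


def groups_for(user: str) -> List[str]:
    """Return the user's groups in sorted order; unknown users get none."""
    return sorted(USER_GROUPS.get(user, set()))


def build_group_filter(user: str, column: str = "allowed_group") -> Dict[str, List[str]]:
    """Build a filter dict that restricts results to the user's groups.

    Assumption: the query API treats a list value as "column IN (...)".
    """
    groups = groups_for(user)
    if not groups:
        # Fail closed: a user without groups must not run an unfiltered query.
        raise PermissionError(f"user {user!r} has no group memberships")
    return {column: groups}


def post_filter(rows: Iterable[dict], user: str, column: str = "allowed_group") -> List[dict]:
    """Defensive second check: keep only rows whose group the user belongs to."""
    allowed = set(groups_for(user))
    return [row for row in rows if row.get(column) in allowed]
```

The dict returned by `build_group_filter` would be passed to the filter API at query time, and `post_filter` would be applied to the returned rows before they reach the RAG prompt. Because the group lookup happens in application code on every request, a membership change takes effect without re-indexing the documents.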

re
New Contributor II

Thanks AI for summarizing my question. However, you did not actually answer it.
