Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.

Evaluating View-Based Access Control vs. Native Databricks Security Features

jackgurae
New Contributor II

Hello Databricks Community,

I'm seeking advice and insights on best practices for managing data access and permissions in Databricks. Our company currently uses a view-based approach for access control, but I'm wondering if we should transition to Databricks' native security features like row-level security (RLS) and column masks.

Our Current Approach
We create multiple views on top of upstream tables to manage permissions. For example:
- `client_info` (upstream table)
- `npii_client_info`
- `pii_client_info`
- `pii_client_info_teamA`
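
To make the setup above concrete, here is a rough sketch of how views like these might be defined. The column names (`client_id`, `team`, `signup_date`, `full_name`, `email`) and the `team = 'teamA'` filter are assumptions for illustration only, as is the guess about what each view exposes; only the table and view names come from the list above.

```python
# Hypothetical schema: these column names are assumptions, not the real table.
NON_PII_COLUMNS = ["client_id", "team", "signup_date"]
PII_COLUMNS = ["full_name", "email"]


def create_view_sql(view, source, columns, where=None):
    """Build a CREATE OR REPLACE VIEW statement selecting the given columns."""
    sql = f"CREATE OR REPLACE VIEW {view} AS SELECT {', '.join(columns)} FROM {source}"
    if where:
        sql += f" WHERE {where}"
    return sql


statements = [
    # Non-PII columns only.
    create_view_sql("npii_client_info", "client_info", NON_PII_COLUMNS),
    # All columns, including PII.
    create_view_sql("pii_client_info", "client_info", NON_PII_COLUMNS + PII_COLUMNS),
    # PII columns, restricted to one team's rows (assumed filter).
    create_view_sql(
        "pii_client_info_teamA",
        "client_info",
        NON_PII_COLUMNS + PII_COLUMNS,
        where="team = 'teamA'",
    ),
]

for stmt in statements:
    print(stmt)  # in a Databricks notebook, each could be passed to spark.sql(stmt)
```

Each new audience or column set means another view plus its own grants, which is where the concerns below come from.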

Concerns
1. Data lineage complexity
2. Maintenance overhead (multiple data dictionaries)
3. Potential inconsistencies across views
4. Scalability as data and organization grow

Questions
1. What are the pros and cons of our view-based approach vs. using Databricks' RLS and column masks?
2. Has anyone successfully transitioned from a view-based system to native Databricks security features? What challenges did you face?
3. Are there specific use cases where view-based access control might be preferable?
4. How does the performance compare between these two approaches, especially for large datasets?
5. What impact does each approach have on data governance and compliance efforts?

I'd greatly appreciate any insights, experiences, or best practices you can share. Thank you in advance for your help!

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @jackgurae, managing data access and permissions in Databricks comes down to choosing between two main approaches: a view-based method and Databricks' native security features.

The view-based approach offers simplicity by abstracting complex logic and enabling fine-grained control over access through different views tailored for user groups or roles. However, it introduces challenges such as intricate data lineage management, maintenance overhead with separate data dictionaries, consistency issues, and scalability concerns as data and user bases expand.

In contrast, Databricks' Row-Level Security (RLS) and Column Masks apply policies directly to the underlying table: row filters restrict which rows a user can see based on user attributes, and column masks redact values dynamically based on user identity. This native integration strengthens data governance and compliance efforts, although it comes with a learning curve and is limited to Shared Clusters.
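
As a rough illustration, the sketch below generates the kind of SQL that could attach a row filter and a column mask to a single `client_info` table instead of maintaining separate views. The `team` and `email` columns, the group names (`pii_admins`, `pii_readers`, `teamA`), and the helper function names are all assumptions made up for this example; please adapt them to your own schema and groups, and treat this as a starting point rather than a definitive setup.

```python
def row_filter_statements(table, column, team, admin_group):
    """SQL to create a row-filter function and attach it to the table."""
    func = f"{column}_filter"  # hypothetical function name
    return [
        f"CREATE OR REPLACE FUNCTION {func}({column} STRING) "
        f"RETURN is_account_group_member('{admin_group}') "
        f"OR ({column} = '{team}' AND is_account_group_member('{team}'))",
        f"ALTER TABLE {table} SET ROW FILTER {func} ON ({column})",
    ]


def column_mask_statements(table, column, reader_group):
    """SQL to create a masking function and attach it to one column."""
    func = f"{column}_mask"  # hypothetical function name
    return [
        f"CREATE OR REPLACE FUNCTION {func}({column} STRING) "
        f"RETURN CASE WHEN is_account_group_member('{reader_group}') "
        f"THEN {column} ELSE '***' END",
        f"ALTER TABLE {table} ALTER COLUMN {column} SET MASK {func}",
    ]


# Assumed names: 'team' and 'email' columns, 'pii_admins' and 'pii_readers' groups.
statements = (
    row_filter_statements("client_info", "team", "teamA", "pii_admins")
    + column_mask_statements("client_info", "email", "pii_readers")
)

for stmt in statements:
    print(stmt)  # in a Databricks notebook, each could be passed to spark.sql(stmt)
```

With this pattern, users query `client_info` directly and the policy decides what they see, so there is one table to document and trace in lineage.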

Thank you @Kaniz_Fatma for your contribution.

I'd like to learn more about real use cases where a company decided to go with Databricks' Row-Level Security (RLS) and Column Masks, so that I can convince our teams to consider the approach.
To me the pros outweigh the cons, but I want to know what angle I might be missing.
