Imagine you're a global retail company with customer order data spanning multiple regions—North America, Europe, and Asia Pacific. You need to share this data with regional sales partners, however each partner should only see data from their own region.
You could maintain separate tables for each region, but that's inefficient and creates a maintenance nightmare. Or you could share everything and trust partners to filter correctly—but that violates data privacy principles and compliance requirements.
Fortunately, Delta Sharing offers two powerful patterns for fine-grained access control:
Let's explore both approaches, understand when to use each, and see them in action.
Dynamic views use the current_recipient() function to filter data based on who is accessing it. The same view can return different data to different recipients—all without creating multiple copies of your data.
Here's the architecture:
Step 1: Create the source table (Provider)
|
CREATE TABLE global_retail.sales.customer_ |
Step 2: Create a dynamic view with recipient-based filtering (Provider)
|
CREATE OR REPLACE VIEW global_retail.sales.regional_ |
The magic happens in the WHERE clause — CURRENT_RECIPIENT('region') dynamically resolves to each recipient's region property.
Step 3: Create a single share for all recipients (Provider)
|
CREATE SHARE IF NOT EXISTS regional_orders_dynamic |
|
ALTER SHARE regional_orders_dynamic |
Step 4: Create recipients with custom properties (Provider)
|
CREATE RECIPIENT IF NOT EXISTS us_partner |
|
-- Set the region property for this recipient |
Step 5: Grant access (one grant for all) (Provider)
|
GRANT SELECT ON SHARE regional_orders_dynamic TO RECIPIENT us_partner; |
Alternative: Using Built-in Recipient Name
You can also leverage the built-in databricks.name property instead of custom properties:
|
CREATE OR REPLACE VIEW global_retail.sales.regional_ |
This approach provides more explicit control over the mapping between recipients and data.
Implementation: Recipient Side
Recipients mount the shared catalog and query normally—the filtering happens transparently:
|
-- Mount the share as a catalog |
|
-- Query the view - you only see your region's data |
As the us_partner recipient, this query returns only United States orders. The EU partner querying the exact same view sees only European orders. It's the same share, the same view, but personalized data.
Dynamic views can also be used to implement column-level security, such as obfuscation or redaction of sensitive data. In this case you would use the CURRENT_RECIPIENT() function in the column list as opposed to the WHERE clause as shown in the following example:
|
CREATE VIEW sensitive_data_view AS |
Different recipients see different levels of detail based on their access_level property.
Column level security can be combined with row level filtering using Dynamic Views as well.
1. Provider-Side Compute Costs
Dynamic views execute on the provider's compute when recipients query them. This means:
For high-volume scenarios, this can be expensive, consider partition filtering (discussed next) to avoid this entirely.
2. Recipient-Only Views
Dynamic views using current_recipient() are recipient-only—you cannot query them locally in the provider workspace:
|
-- This will FAIL in the provider workspace |
Create separate local views for testing/validation.
3. Property Management
Recipient properties must be managed carefully:
|
-- View the properties for a recipient |
|
-- Update properties as needed |
Properties are case-sensitive and must exactly match what your view logic expects.
Partition filtering leverages Delta Lake's partitioning capabilities to share specific partitions of a table with specific recipients. This provides storage-level isolation—recipients literally cannot access partitions they're not authorized for.
Here's the architecture:
Step 1: Create a partitioned table (Provider)
|
CREATE TABLE global_retail.sales.orders_by_ |
Step 2: Create a share with partition-specific access (Provider)
|
CREATE SHARE IF NOT EXISTS regional_analytics_share |
|
-- Share only the partition matching recipient's region property |
The key difference: you're sharing the table but restricting access to specific partitions using CURRENT_RECIPIENT().region property.
Step 3: Set up recipients with region properties (Provider)
|
CREATE RECIPIENT IF NOT EXISTS us_partner ALTER RECIPIENT us_partner SET PROPERTIES ('region' = 'United States'); |
Step 4: Grant access (Provider)
|
GRANT SELECT ON SHARE regional_analytics_share TO RECIPIENT us_partner; |
Implementation: Recipient Side
|
-- Mount the share |
|
-- Query the table - only your partition is accessible |
Recipients query a table (not a view), but they can only access their authorized partition. Attempting to query other partitions results in no data—those partitions simply don't exist from the recipient's perspective.
Similar to Dynamic Views properties must be managed at the Provider level, however unlike Dynamic Views, the compute costs are borne by the recipient (not the provider). Some additional considerations for Partition Filtering are listed below:
1. Partition Definitions
Partition definitions need to be made according to business rules - as opposed to typical partitioning which is usually based upon access patterns or performance - such as partitioning on date for voluminous, eventful data.
2. Column Level Security is not Supported
Query time column level redaction or obfuscation is not supported with Partition Filtering, if this is required consider Dynamic Views.
|
Factor |
Dynamic Views |
Partition Filtering |
|
Granularity |
Row-level (any filter logic) |
Partition-level only |
|
Flexibility |
High - supports complex WHERE clauses, column redaction, CASE statements |
Low - limited to partition columns |
|
Performance |
Good for moderate datasets; filters at query time |
Excellent - storage-level filtering |
|
Provider Compute |
Uses provider compute when queried by recipient |
No provider compute required |
|
Cardinality |
Works with high cardinality (millions of distinct values) |
Best for low cardinality (e.g., regions, years) |
|
Column Redaction |
Supports column masking/obfuscation |
Not supported |
|
Data Skipping |
Limited (predicate pushdown where possible) |
Excellent (partition pruning) |
|
Management |
Single share for all recipients |
Can use single share with partition filtering |
|
Transparency |
Recipients see filtered results |
Recipients see complete partition(s) |
Use Dynamic Views When:
Use Partition Filtering When:
Grant only the minimum access necessary for each recipient's business requirements. Start with restrictive filters and expand access only when justified. Regularly review recipient properties and share memberships to ensure they remain appropriate as business relationships evolve.
|
-- Show grants to understand recipient access -- Show grants to a recipient |
Track who accesses what using Databricks system tables. The system.access.audit table captures Delta Sharing events including share access, credential requests, and data retrieval operations.
|
-- Query audit logs for sharing activity |
Before going live, create test recipients and verify filtering behaves as expected:
|
-- Create a test recipient with specific properties |
Mount the share in a test environment and validate that:
Maintain clear documentation of required properties and their valid values:
|
Property |
Type |
Valid Values |
Description |
|
region |
STRING |
United States, Europe, Asia Pacific |
Geographic data boundary |
|
access_level |
STRING |
full, basic |
Determines column visibility |
|
tier |
STRING |
premium, standard |
Service tier for feature access |
Use view versioning when making changes to avoid disrupting active recipients:
|
-- Create new version with updated logic |
|
-- Test the new view before swapping |
Fine-grained access control is essential for secure data sharing in multi-tenant environments. Databricks Delta Sharing provides two powerful patterns:
Dynamic Views offer maximum flexibility with logic-based filtering and column redaction—perfect for complex access requirements, though they incur provider-side compute costs.
Partition Filtering delivers optimal performance with storage-level isolation—ideal when access boundaries align with partition columns and you want to eliminate provider compute costs.
Choose the pattern that fits your use case, or combine both for defense-in-depth security. Either way, you can confidently share data while maintaining strict privacy boundaries.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.