cancel
Showing results for 
Search instead for 
Did you mean: 
Technical Blog
Explore in-depth articles, tutorials, and insights on data analytics and machine learning in the Databricks Technical Blog. Stay updated on industry trends, best practices, and advanced techniques.
cancel
Showing results for 
Search instead for 
Did you mean: 
jeffreyaven
Databricks Employee
Databricks Employee

The Challenge: Sharing Data While Maintaining Privacy Boundaries

Imagine you're a global retail company with customer order data spanning multiple regions—North America, Europe, and Asia Pacific. You need to share this data with regional sales partners, however each partner should only see data from their own region.

You could maintain separate tables for each region, but that's inefficient and creates a maintenance nightmare. Or you could share everything and trust partners to filter correctly—but that violates data privacy principles and compliance requirements.

Fortunately, Delta Sharing offers two powerful patterns for fine-grained access control:

  • Dynamic Views - Logic-based row-level security
  • Partition Filtering - Storage-level partition isolation

Let's explore both approaches, understand when to use each, and see them in action.

Pattern 1: Dynamic Views with Row-Level Security

How It Works

Dynamic views use the current_recipient() function to filter data based on who is accessing it. The same view can return different data to different recipients—all without creating multiple copies of your data.

Here's the architecture:

dynamic-view-blog-image.png

Implementation Steps

Step 1: Create the source table (Provider)

CREATE TABLE global_retail.sales.customer_orders (
order_id STRING,
customer_name STRING,
email STRING,
region STRING,
product STRING,
amount DECIMAL(10,2),
order_date DATE
);

Step 2: Create a dynamic view with recipient-based filtering (Provider)

CREATE OR REPLACE VIEW global_retail.sales.regional_orders_view AS
SELECT
order_id,
customer_name,
email,
region,
product,
amount,
order_date
FROM global_retail.sales.customer_orders
WHERE region = CURRENT_RECIPIENT('region');

The magic happens in the WHERE clause — CURRENT_RECIPIENT('region') dynamically resolves to each recipient's region property.

Step 3: Create a single share for all recipients (Provider)

CREATE SHARE IF NOT EXISTS regional_orders_dynamic
COMMENT 'Orders with dynamic view filtering';

 

ALTER SHARE regional_orders_dynamic
ADD VIEW global_retail.sales.regional_orders_view;

Step 4: Create recipients with custom properties (Provider)

CREATE RECIPIENT IF NOT EXISTS us_partner
USING ID 'aws:us-west-2:adc48801-0d35-4e94-9cbc-0ff7f29ce611' -- for D2D sharing
COMMENT 'United States regional partner';

 

-- Set the region property for this recipient
ALTER RECIPIENT us_partner SET PROPERTIES ('region' = 'US');
-- Repeat this for EU and APAC partners with their respective regions

Step 5: Grant access (one grant for all) (Provider)

GRANT SELECT ON SHARE regional_orders_dynamic TO RECIPIENT us_partner;
-- Repeat for other recipients

Alternative: Using Built-in Recipient Name

You can also leverage the built-in databricks.name  property instead of custom properties:

CREATE OR REPLACE VIEW global_retail.sales.regional_orders_view AS
SELECT * FROM global_retail.sales.customer_orders
WHERE
CASE
WHEN current_recipient('databricks.name') = 'us_partner' THEN region = 'US'
WHEN current_recipient('databricks.name') = 'eu_partner' THEN region = 'EU'
WHEN current_recipient('databricks.name') = 'ap_partner' THEN region = 'AP'
END;

This approach provides more explicit control over the mapping between recipients and data.

Implementation: Recipient Side

Recipients mount the shared catalog and query normally—the filtering happens transparently:

-- Mount the share as a catalog
CREATE CATALOG IF NOT EXISTS shared_orders_dynamic
USING SHARE acme_corp.regional_orders_dynamic;

 

-- Query the view - you only see your region's data
SELECT * FROM shared_orders_dynamic.sales.regional_orders_view;

As the us_partner recipient, this query returns only United States orders. The EU partner querying the exact same view sees only European orders. It's the same share, the same view, but personalized data.

Column Redaction with Dynamic Views

Dynamic views can also be used to implement column-level security, such as obfuscation or redaction of sensitive data.  In this case you would use the CURRENT_RECIPIENT() function in the column list as opposed to the WHERE clause as shown in the following example:

CREATE VIEW sensitive_data_view AS
SELECT
order_id,
CASE
WHEN current_recipient('access_level') = 'full' THEN customer_name
ELSE '***REDACTED***'
END AS customer_name,
CASE
WHEN current_recipient('access_level') = 'full' THEN email
ELSE CONCAT('***@'SPLIT(email, '@')[1])
END AS email,
amount,
order_date
FROM customer_orders;


Different recipients see different levels of detail based on their access_level property.

Column level security can be combined with row level filtering using Dynamic Views as well.

⚠️ Important Considerations

1. Provider-Side Compute Costs

Dynamic views execute on the provider's compute when recipients query them. This means:

  • Provider bears the compute costs
  • Provider's cluster must be running
  • Performance depends on provider resources

For high-volume scenarios, this can be expensive, consider partition filtering (discussed next) to avoid this entirely.

2. Recipient-Only Views

Dynamic views using current_recipient() are recipient-only—you cannot query them locally in the provider workspace:

-- This will FAIL in the provider workspace
SELECT * FROM regional_orders_view;
-- Error: current_recipient() only works in Delta Sharing context

Create separate local views for testing/validation.

3. Property Management

Recipient properties must be managed carefully:

-- View the properties for a recipient
DESCRIBE RECIPIENT us_partner;

 

-- Update properties as needed
ALTER RECIPIENT us_partner SET PROPERTIES ('region' = 'United States''tier' = 'premium');

Properties are case-sensitive and must exactly match what your view logic expects.

Pattern 2: Partition Filtering with Storage-Level Isolation

How It Works

Partition filtering leverages Delta Lake's partitioning capabilities to share specific partitions of a table with specific recipients. This provides storage-level isolation—recipients literally cannot access partitions they're not authorized for.

Here's the architecture:

partition-filtering-blog-image.png

Step 1: Create a partitioned table (Provider)

CREATE TABLE global_retail.sales.orders_by_region (
order_id STRING,
customer_name STRING,
product STRING,
amount DECIMAL(10,2),
order_date DATE,
region STRING
)
PARTITIONED BY (region);

Step 2: Create a share with partition-specific access (Provider)

CREATE SHARE IF NOT EXISTS regional_analytics_share
COMMENT 'Share using Partition Filtering';

 

-- Share only the partition matching recipient's region property
ALTER SHARE regional_analytics_share
ADD TABLE global_retail.sales.orders_by_region
PARTITION(region = CURRENT_RECIPIENT().region);

The key difference: you're sharing the table but restricting access to specific partitions using CURRENT_RECIPIENT().region property.

Step 3: Set up recipients with region properties (Provider)

CREATE RECIPIENT IF NOT EXISTS us_partner
USING ID 'aws:us-west-2:adc48801-0d35-4e94-9cbc-0ff7f29ce611' -- for D2D sharing;

ALTER RECIPIENT us_partner SET PROPERTIES ('region' = 'United States');

Step 4: Grant access (Provider)

GRANT SELECT ON SHARE regional_analytics_share TO RECIPIENT us_partner;


Implementation: Recipient Side

-- Mount the share
CREATE CATALOG IF NOT EXISTS us_orders_partition
USING SHARE acme_corp.regional_analytics_share;

 

-- Query the table - only your partition is accessible
SELECT * FROM us_orders_partition.sales.orders_by_region;

Recipients query a table (not a view), but they can only access their authorized partition. Attempting to query other partitions results in no data—those partitions simply don't exist from the recipient's perspective.

⚠️ Important Considerations

Similar to Dynamic Views properties must be managed at the Provider level, however unlike Dynamic Views, the compute costs are borne by the recipient (not the provider).  Some additional considerations for Partition Filtering are listed below:

1. Partition Definitions

Partition definitions need to be made according to business rules - as opposed to typical partitioning which is usually based upon access patterns or performance - such as partitioning on date for voluminous, eventful data.

2. Column Level Security is not Supported

Query time column level redaction or obfuscation is not supported with Partition Filtering, if this is required consider Dynamic Views.

Dynamic Views vs Partition Filtering: When to Use Each

Factor

Dynamic Views

Partition Filtering

Granularity

Row-level (any filter logic)

Partition-level only

Flexibility

High - supports complex WHERE clauses, column redaction, CASE statements

Low - limited to partition columns

Performance

Good for moderate datasets; filters at query time

Excellent - storage-level filtering

Provider Compute

Uses provider compute when queried by recipient

No provider compute required

Cardinality

Works with high cardinality (millions of distinct values)

Best for low cardinality (e.g., regions, years)

Column Redaction

Supports column masking/obfuscation

Not supported

Data Skipping

Limited (predicate pushdown where possible)

Excellent (partition pruning)

Management

Single share for all recipients

Can use single share with partition filtering

Transparency

Recipients see filtered results

Recipients see complete partition(s)

Use Dynamic Views When:

 
  • You need row-level filtering beyond partition boundaries
  • Filter criteria is complex or changes frequently
  • You want to redact/mask specific columns
  • Filtering dimension has high cardinality
  • Data volume per recipient is manageable

Use Partition Filtering When:

 
  • Access boundaries align with partition columns (region, department, etc)
  • You need maximum query performance
  • You want to avoid provider-side compute costs
  • Partitions have low cardinality (<1000 values)
  • Storage-level isolation is required for compliance

Production Best Practices

1. Apply Principle of Least Privilege

Grant only the minimum access necessary for each recipient's business requirements. Start with restrictive filters and expand access only when justified. Regularly review recipient properties and share memberships to ensure they remain appropriate as business relationships evolve.

-- Show grants to understand recipient access
SHOW GRANTS ON SHARE regional_orders_dynamic;

-- Show grants to a recipient
SHOW GRANTS TO RECIPIENT us_partner;

2. Audit and Monitor

Track who accesses what using Databricks system tables. The system.access.audit table captures Delta Sharing events including share access, credential requests, and data retrieval operations.

-- Query audit logs for sharing activity
SELECT
  event_time,
  action_name,
  request_params.recipient_name,
  request_params.share_name,
  source_ip_address
FROM system.access.audit
WHERE service_name = 'deltasharing'
  AND event_date >= current_date() - INTERVAL 7 DAYS
ORDER BY event_time DESC;

3. Test Access Controls

Before going live, create test recipients and verify filtering behaves as expected:

-- Create a test recipient with specific properties
CREATE RECIPIENT test_us USING ID 'test-metastore-id';
ALTER RECIPIENT test_us SET PROPERTIES ('region' = 'United States');
GRANT SELECT ON SHARE regional_orders_dynamic TO RECIPIENT test_us;

Mount the share in a test environment and validate that:

  • Only expected rows are visible
  • Sensitive columns are properly redacted
  • Partition pruning occurs where expected

4. Document Recipient Properties

Maintain clear documentation of required properties and their valid values:

Property

Type

Valid Values

Description

region

STRING

United StatesEuropeAsia Pacific

Geographic data boundary

access_level

STRING

fullbasic

Determines column visibility

tier

STRING

premiumstandard

Service tier for feature access

5. Version Your Views

Use view versioning when making changes to avoid disrupting active recipients:

-- Create new version with updated logic
CREATE VIEW regional_orders_view_v2 AS
SELECT ...  -- Updated logic
FROM customer_orders
WHERE region = CURRENT_RECIPIENT('region');

 

-- Test the new view before swapping
-- Then update the share
ALTER SHARE regional_orders_dynamic DROP VIEW regional_orders_view;
ALTER SHARE regional_orders_dynamic ADD VIEW regional_orders_view_v2;

Conclusion

Fine-grained access control is essential for secure data sharing in multi-tenant environments. Databricks Delta Sharing provides two powerful patterns:

Dynamic Views offer maximum flexibility with logic-based filtering and column redaction—perfect for complex access requirements, though they incur provider-side compute costs.

Partition Filtering delivers optimal performance with storage-level isolation—ideal when access boundaries align with partition columns and you want to eliminate provider compute costs.

Choose the pattern that fits your use case, or combine both for defense-in-depth security. Either way, you can confidently share data while maintaining strict privacy boundaries.