
Data Encryption in DLT

Gilg
Contributor II

Hi Team,

We have a requirement to encrypt PII data in the Silver layer. What is the best way to implement this in DLT, so that only users with the right security privileges can decrypt the PII?

I have done this in the past using Structured Streaming but not in DLT, so that's my other option. 

Cheers,

G

2 REPLIES

Kaniz
Community Manager

Hi @Gilg , 

To securely encrypt PII data in Databricks Delta Lake:

  1. Use a trusted Key Management Service (KMS) to store encryption keys.

  2. Create encryption and decryption functions in PySpark with KMS integration.

  3. Identify and encrypt PII data using PySpark, then store it in Delta Lake.

  4. Control access to decryption keys and KMS, allowing only authorized users.

  5. If you have done this before with Structured Streaming, apply the same approach in DLT.

This ensures strict security for your PII data while adhering to compliance standards.
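As a rough illustration of steps 1 to 4 in a DLT pipeline, here is a minimal sketch, not a definitive implementation. It assumes the Spark SQL functions `aes_encrypt`/`aes_decrypt` and the Databricks SQL `secret(scope, key)` function are available in the pipeline's runtime. The table names (`customers_bronze`, `customers_silver`), the PII column list, and the secret scope/key (`pii`, `aes_key`) are hypothetical placeholders.

```python
# Sketch only: all table, column, and secret names are hypothetical.
PII_COLUMNS = ["email", "phone"]  # assumed PII columns in the bronze table

# Assumed: the AES key is stored in a KMS-backed secret scope and read
# through the Databricks SQL `secret` function.
DEFAULT_KEY_SQL = "secret('pii', 'aes_key')"


def encrypt_expr(column: str, key_sql: str = DEFAULT_KEY_SQL) -> str:
    """Build a SQL expression that AES-encrypts a column and base64-encodes it."""
    return f"base64(aes_encrypt(CAST(`{column}` AS STRING), {key_sql}))"


def decrypt_expr(column: str, key_sql: str = DEFAULT_KEY_SQL) -> str:
    """Build the matching SQL expression that decodes and decrypts a column."""
    return f"CAST(aes_decrypt(unbase64(`{column}`), {key_sql}) AS STRING)"


def define_silver_table():
    """Register the silver table; call this only inside a DLT pipeline notebook."""
    import dlt  # available only in a Databricks DLT pipeline
    from pyspark.sql import functions as F

    @dlt.table(name="customers_silver", comment="Silver table with encrypted PII")
    def customers_silver():
        df = dlt.read_stream("customers_bronze")
        # Overwrite each PII column with its encrypted form.
        for column in PII_COLUMNS:
            df = df.withColumn(column, F.expr(encrypt_expr(column)))
        return df
```

In this sketch, only principals with read access to the secret scope can evaluate `decrypt_expr`, for example through `F.expr(decrypt_expr("email"))` in a downstream query, which is one way to approximate step 4. If `secret` is not available in your runtime, fetching the key with `dbutils.secrets.get` and passing it as a literal is an alternative, at the cost of the key appearing in the expression.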

Gilg
Contributor II

Can you show me how to use the built-in PySpark functions in DLT, please?

Also, I am trying to implement column/row-level security on the silver tables generated by DLT, but I get the following error:

[RequestId=35024c5d-ad05-4f68-a4cb-f3a723f66e1c ErrorClass=INVALID_PARAMETER_VALUE.ROW_COLUMN_ACCESS_POLICIES_UNSUPPORTED_ON_VIEWS] Cannot set row and column access policies on views

It looks like this is not supported; can you suggest other ways, please?
