Data Engineering

Data Encryption in DLT

Gilg
Contributor II

Hi Team,

We have a requirement to encrypt PII data in the Silver layer. What is the best way to implement this in DLT, so that only users with the right security privileges can decrypt the PII?

I have done this in the past with Structured Streaming, but not in DLT, so falling back to Structured Streaming is my other option.

Cheers,

G

2 REPLIES

Kaniz
Community Manager

Hi @Gilg,

To securely encrypt PII data in Databricks Delta Lake:

  1. Use a trusted Key Management Service (KMS) to store encryption keys.

  2. Create encryption and decryption functions in PySpark with KMS integration.

  3. Identify and encrypt PII data using PySpark, then store it in Delta Lake.

  4. Control access to decryption keys and KMS, allowing only authorized users.

  5. If you have already done this with Structured Streaming, apply the same approach here.
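Steps 2 and 3 could look roughly like the sketch below. It is only an illustration under several assumptions: the table names (`customers_bronze`, `customers_silver`), the PII columns, and the secret scope and key (`pii`, `aes-key`) are all hypothetical, the key is assumed to be synced from your KMS into a Databricks secret scope, `dbutils.secrets.get` is assumed to be available in the pipeline notebook, and `F.aes_encrypt` needs PySpark 3.5 or later. The `dlt` import is guarded so the key check can be tested outside a pipeline.

```python
# Sketch only. Table, column, scope, and key names are hypothetical.
try:
    import dlt  # importable only inside a DLT pipeline
    from pyspark.sql import functions as F
except ImportError:
    dlt = None  # lets the helper below be imported and unit-tested locally

PII_COLUMNS = ["email", "phone"]  # hypothetical PII columns


def check_aes_key(key: str) -> str:
    """Return the key unchanged if it has a valid AES length (16, 24, or 32 bytes)."""
    if len(key.encode("utf-8")) not in (16, 24, 32):
        raise ValueError("AES key must be 16, 24, or 32 bytes long")
    return key


if dlt is not None:
    # Assumption: the key is stored in a secret scope that only the pipeline
    # owner and authorized readers can access.
    AES_KEY = check_aes_key(dbutils.secrets.get(scope="pii", key="aes-key"))

    @dlt.table(name="customers_silver", comment="Silver table with PII encrypted")
    def customers_silver():
        df = dlt.read_stream("customers_bronze")
        for c in PII_COLUMNS:
            # aes_encrypt returns BINARY; base64 turns it into a printable string.
            df = df.withColumn(
                c, F.base64(F.aes_encrypt(F.col(c).cast("string"), F.lit(AES_KEY)))
            )
        return df
```

Because the key is passed as a literal, it may show up in query plans, so treat this as a starting point rather than a hardened setup.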

This keeps PII encrypted at rest and limits decryption to authorized users, which helps you meet compliance requirements.
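For step 4, one possible approach is a dynamic view, created outside the pipeline on top of the encrypted table, that decrypts PII only for members of a given group. The sketch below just builds the DDL string. The view, table, column, group, and secret names are hypothetical, and it assumes the columns were stored as base64-encoded `aes_encrypt` output and that the Databricks SQL functions `is_account_group_member` and `secret` are available in your workspace.

```python
# Sketch only: view, table, column, group, scope, and key names are hypothetical.
def decrypting_view_sql(
    view: str,
    table: str,
    pii_columns: list[str],
    other_columns: list[str],
    group: str,
    secret_scope: str = "pii",
    secret_key: str = "aes-key",
) -> str:
    """Build CREATE VIEW DDL that decrypts PII columns only for members of `group`."""
    select = list(other_columns)
    for c in pii_columns:
        # Non-members fall through to ELSE and see the stored ciphertext.
        select.append(
            f"CASE WHEN is_account_group_member('{group}') "
            f"THEN CAST(aes_decrypt(unbase64({c}), "
            f"secret('{secret_scope}', '{secret_key}')) AS STRING) "
            f"ELSE {c} END AS {c}"
        )
    return f"CREATE OR REPLACE VIEW {view} AS SELECT {', '.join(select)} FROM {table}"
```

In a notebook you would pass the result to `spark.sql(...)`, then grant users `SELECT` on the view rather than on the underlying table.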

Gilg
Contributor II

Can you show me how to use the built-in PySpark functions in DLT, please?

Also, I am trying to implement column- and row-level security on the silver tables generated by DLT, but I get the following error:

[RequestId=35024c5d-ad05-4f68-a4cb-f3a723f66e1c ErrorClass=INVALID_PARAMETER_VALUE.ROW_COLUMN_ACCESS_POLICIES_UNSUPPORTED_ON_VIEWS] Cannot set row and column access policies on views

It looks like this is not supported. Can you suggest other approaches, please?