Data Engineering

Encryption/Decryption options in ADB

Anonymous
Not applicable

Hello all,

We are working on a client requirement to implement suitable data encryption in Azure Databricks.

We need to encrypt and decrypt data based on user access. We explored the Fernet library, but the client rejected it, saying it degrades performance.

We also explored the aes_encrypt functions, but we are looking for other, better options. If anyone has implemented this capability, please share your suggestions; a quick reply would be appreciated.

Thanks,

Porus

5 REPLIES

-werners-
Esteemed Contributor III

Do you mean encrypting data while Databricks is processing it? Data sitting in a data lake is already encrypted (and can be double encrypted for sensitive data).

Anonymous
Not applicable

Yes, we need to encrypt the sensitive data before storing it. Only designated users will be able to read the decrypted data.

Hubert-Dudek
Esteemed Contributor III
  • I would create a SQL UDF that wraps aes_encrypt / aes_decrypt.
  • Unity Catalog is coming to GA, and support for managing access will improve there. I haven't tested it in practice, but it should be possible to set access rights per column and to apply a function per column.
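
A minimal sketch of the SQL UDF idea above, written in Python so it can be run from a notebook. It only builds the CREATE FUNCTION statements and runs them if a `spark` session exists. The schema `main.security`, the secret scope `pii_scope`, the secret name `aes_key`, the group `pii_readers`, and the function names `encrypt_value` / `decrypt_value` are all hypothetical. It assumes the Databricks SQL built-ins `aes_encrypt`, `aes_decrypt`, `secret`, and `is_member` are available on your runtime, and that the stored secret is a 16, 24, or 32 byte key; I have not verified how `secret` permissions behave inside a UDF for every setup, so treat this as a starting point.

```python
from textwrap import dedent
from typing import List


def build_crypto_udf_sql(schema: str, secret_scope: str,
                         secret_key: str, reader_group: str) -> List[str]:
    """Return CREATE FUNCTION statements for an encrypt/decrypt SQL UDF pair."""
    # The AES key is read from a secret scope rather than hard-coded.
    key_expr = f"secret('{secret_scope}', '{secret_key}')"

    # Encrypt: AES-encrypt the value and base64-encode it so it fits a STRING column.
    encrypt_udf = dedent(f"""\
        CREATE OR REPLACE FUNCTION {schema}.encrypt_value(plain STRING)
        RETURNS STRING
        RETURN base64(aes_encrypt(plain, {key_expr}))""")

    # Decrypt: only members of the reader group get plaintext, everyone else gets NULL.
    decrypt_udf = dedent(f"""\
        CREATE OR REPLACE FUNCTION {schema}.decrypt_value(cipher STRING)
        RETURNS STRING
        RETURN CASE
          WHEN is_member('{reader_group}')
            THEN CAST(aes_decrypt(unbase64(cipher), {key_expr}) AS STRING)
          ELSE NULL
        END""")

    return [encrypt_udf, decrypt_udf]


# All four arguments are hypothetical placeholders; replace them with your own names.
statements = build_crypto_udf_sql("main.security", "pii_scope", "aes_key", "pii_readers")

# Databricks notebooks predefine `spark`; anywhere else, just print the SQL.
if "spark" in globals():
    for stmt in statements:
        spark.sql(stmt)
else:
    print("\n\n".join(statements))
```

Once the functions exist, a query such as `SELECT main.security.decrypt_value(ssn_enc) FROM customers` (table and column names again hypothetical) would return plaintext only to members of the reader group.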

-werners-
Esteemed Contributor III

That was my first thought too, but the OP mentions they already explored Fernet, which was not performant enough. Fernet applies AES-128 in CBC mode with an HMAC-SHA256 message authentication code, so aes_encrypt/aes_decrypt will probably be comparable in performance (depending on key length). AES is hardware accelerated on most x86 CPUs, so it should already be very fast.

But frankly, any cryptographic operation takes CPU, so it is not clear to me what the client expects.

PS. +1 for Unity Catalog

Vidula
Honored Contributor

Hi @purushotham Chanda​ 

Hope all is well! Just checking in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!
