07-05-2022 09:50 AM
Hello all,
We are working on a client requirement to implement suitable data encryption in Azure Databricks.
We need to encrypt and decrypt data based on user access. We explored the Fernet library, but the client rejected it, saying it degrades performance.
We also explored the AES encryption functions, but we are looking for better options. If anyone has implemented this capability, please share your suggestions; a quick reply would be appreciated.
Thanks,
Porus
07-06-2022 03:19 AM
Do you mean data encryption while Databricks is processing the data? The data sitting in a data lake is already encrypted (and can be double encrypted for sensitive data).
07-06-2022 04:10 AM
Yes, we need to encrypt the sensitive data before storing it. Only designated users should be able to read the decrypted data.
07-13-2022 06:13 AM
07-13-2022 06:35 AM
That was my first thought too, but the OP mentions they already explored Fernet, which was not performant enough. Fernet applies AES-128 in CBC mode with an HMAC-SHA256 message authentication code, so aes_encrypt/aes_decrypt will probably be comparable in performance (depending on key strength). AES is hardware accelerated on most x86 CPUs, so it should already be very fast.
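To make the Fernet part concrete, here is a minimal sketch (standard library only) that splits a Fernet token into its fields and re-checks the HMAC. It assumes the published Fernet token layout (version byte, 8-byte timestamp, 16-byte IV, AES-CBC ciphertext, 32-byte HMAC-SHA256) and a key whose first 16 bytes are the signing key. The function names are mine, purely for illustration, and it does not decrypt anything.

```python
import base64
import hashlib
import hmac
import struct


def split_fernet_token(token: bytes) -> dict:
    """Split a Fernet token into its fields (no decryption)."""
    raw = base64.urlsafe_b64decode(token)
    # 1 version + 8 timestamp + 16 IV + at least one 16-byte block + 32 HMAC
    if len(raw) < 73 or raw[0] != 0x80:
        raise ValueError("not a Fernet token")
    return {
        "timestamp": struct.unpack(">Q", raw[1:9])[0],  # seconds since epoch
        "iv": raw[9:25],                                # AES-CBC IV
        "ciphertext": raw[25:-32],                      # AES-128-CBC output
        "hmac": raw[-32:],                              # HMAC-SHA256 tag
        "signed": raw[:-32],                            # bytes the tag covers
    }


def hmac_is_valid(token: bytes, key: bytes) -> bool:
    """Recompute the HMAC-SHA256 tag with the signing half of the key."""
    signing_key = base64.urlsafe_b64decode(key)[:16]
    parts = split_fernet_token(token)
    expected = hmac.new(signing_key, parts["signed"], hashlib.sha256).digest()
    return hmac.compare_digest(expected, parts["hmac"])
```

So every Fernet value carries 57 bytes of framing on top of the padded ciphertext, and each call does one AES-CBC pass plus one HMAC pass over the data.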
But frankly, any cryptographic operation will take CPU, so it is not clear to me what the client expects.
PS. +1 for Unity Catalog
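For what it's worth, a rough sketch of how the aes_encrypt/aes_decrypt route could be wired up so that only designated users see plaintext. It only builds Spark SQL expression strings, so it runs without a cluster. Assumptions on my side: the helper names, the `_enc`/`_plain` column suffixes, storing the ciphertext base64-encoded, and gating on a group with `is_member`; none of that comes from the OP's setup.

```python
AES_KEY_BYTES = (16, 24, 32)  # AES-128, AES-192, AES-256


def key_bits(key: bytes) -> int:
    """Return the AES key strength in bits, rejecting invalid lengths."""
    if len(key) not in AES_KEY_BYTES:
        raise ValueError("AES key must be 16, 24, or 32 bytes")
    return len(key) * 8


def encrypt_expr(column: str, key_expr: str, mode: str = "GCM") -> str:
    """SQL expression that encrypts `column` and base64-encodes the result."""
    return f"base64(aes_encrypt({column}, {key_expr}, '{mode}')) AS {column}_enc"


def decrypt_expr(enc_column: str, key_expr: str, group: str, mode: str = "GCM") -> str:
    """SQL expression that decrypts only for members of `group`.

    Non-members get NULL because the CASE has no ELSE branch.
    """
    return (
        f"CASE WHEN is_member('{group}') "
        f"THEN CAST(aes_decrypt(unbase64({enc_column}), {key_expr}, '{mode}') AS STRING) "
        f"END AS {enc_column}_plain"
    )
```

The strings can be passed to something like `df.selectExpr(...)` on write, or used in the SELECT list of a view on read. `key_expr` is whatever SQL expression yields the key in your workspace; how the key is stored and who can read it is the part that actually decides access.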
08-31-2022 11:27 PM
Hi @purushotham Chanda
Hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!