Encrypt and decrypt personal data with Spark Databricks.
We create a table that holds personal information, but we want to hide the personal identifiers so that no one can read them without the key.
We set a key. A key needs to be 16, 24, or 32 bytes long (1 byte = 1 character for ASCII text). We use a widget for that, but only for development purposes. In production, we should store the key in Key Vault.
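The key-length rule can be checked before the key is used. The sketch below is only an illustration, not code from the notebook: the helper name `validate_aes_key`, the widget name, and the secret scope names are hypothetical. It counts UTF-8 bytes rather than characters, because the "1 byte = 1 char" shortcut holds only for ASCII.

```python
VALID_AES_KEY_LENGTHS = (16, 24, 32)  # AES-128, AES-192, AES-256


def validate_aes_key(key: str) -> bytes:
    """Return the key as UTF-8 bytes, or raise ValueError if its byte length is not valid for AES."""
    key_bytes = key.encode("utf-8")
    if len(key_bytes) not in VALID_AES_KEY_LENGTHS:
        raise ValueError(
            f"AES key must be 16, 24, or 32 bytes, got {len(key_bytes)}"
        )
    return key_bytes


# In a Databricks notebook the key could be read from a widget (development only):
#   dbutils.widgets.text("encryption_key", "")          # hypothetical widget name
#   key = dbutils.widgets.get("encryption_key")
# or from a secret scope, which can be backed by Azure Key Vault:
#   key = dbutils.secrets.get(scope="my-scope", key="my-key")  # hypothetical names
```

A 16-character ASCII string passes the check, while a string of 16 accented characters would be 32 bytes in UTF-8 and so count as a 32-byte key.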
We insert data into the table and encrypt the phone field with the aes_encrypt function. The aes_encrypt and aes_decrypt functions are available since Databricks Runtime 10.3.
Now we can preview the data. Without decryption, the encrypted field is unreadable.
We need to use the aes_decrypt function with our key when we want to read it.
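The encrypt and decrypt steps might look roughly like the sketch below. It is not the notebook's code: the table name `personal_data`, its columns, the sample row, and all helper names are hypothetical, and `spark` is assumed to be the SparkSession that a Databricks notebook provides. Because the key is placed in the SQL text as a literal, the sketch accepts only ASCII letters and digits. `base64` and `unbase64` are used so that the binary output of aes_encrypt can be stored in a STRING column.

```python
def ensure_safe_key(key: str) -> str:
    """Accept only ASCII letters and digits so the key is safe inside a SQL string literal."""
    if not (key.isascii() and key.isalnum()):
        raise ValueError("this sketch only supports ASCII alphanumeric keys")
    return key


# Hypothetical table used only for this illustration.
CREATE_TABLE_SQL = (
    "CREATE TABLE IF NOT EXISTS personal_data (name STRING, phone STRING)"
)


def build_insert_sql(key: str) -> str:
    """Insert one made-up sample row with the phone number encrypted."""
    key = ensure_safe_key(key)
    return (
        "INSERT INTO personal_data "
        f"SELECT 'Alice', base64(aes_encrypt('555-0100', '{key}'))"
    )


def build_select_sql(key: str) -> str:
    """Read the rows back, decrypting the phone column with the same key."""
    key = ensure_safe_key(key)
    return (
        "SELECT name, "
        f"CAST(aes_decrypt(unbase64(phone), '{key}') AS STRING) AS phone "
        "FROM personal_data"
    )


def encrypt_sample_row(spark, key: str) -> None:
    """Create the table if needed and insert the encrypted sample row."""
    spark.sql(CREATE_TABLE_SQL)
    spark.sql(build_insert_sql(key))


def read_decrypted(spark, key: str):
    """Return the result of the decrypting SELECT (a DataFrame on a real SparkSession)."""
    return spark.sql(build_select_sql(key))
```

In a notebook this could be called as `encrypt_sample_row(spark, key)` followed by `read_decrypted(spark, key).show()`. Embedding the key as a literal keeps the sketch short, but the key then appears in the query text, which is one more reason to treat this as development-only.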
Please also watch my video about aes_encrypt and aes_decrypt:
➡️ https://www.youtube.com/watch?v=OGLf_PiFMks
Link to GitHub with the above notebook: https://github.com/hubert-dudek/databricks-hubert/blob/main/linkedin/decrypt%20encrypt/decrypt%20enc...