09-19-2023 01:53 AM
Hi,
I am trying to set up Unity-Catalog for my company and ran into a problem today. Basically, for each new source of data we ingest, we create a view-layer on top of the "tables". We do that because we have pseudonymized information in our datalake environment, and we decrypt the information on-the-fly using views.
We organize the view-layer by putting views that belong to a source inside a database/schema. We then provide access to the whole database/schema for users that need it.
BEFORE UNITY-CATALOG:
Back then I discovered that, if you can view the metadata of VIEWS, you could see the key that was used to decrypt columns in plain-text. I could, however, restrict this by simply removing READ_METADATA permission from users. That way, users could not see the schema, history, or any other detail about the view (see screenshot depicted below).
WITH UNITY-CATALOG:
Even if I provide only USE CATALOG (on catalog level), USE SCHEMA and SELECT (on database/schema level) permission to users, they can still see the "View definition" in the Details tab of that view. This exposes the decryption key in plain-text (see screenshot depicted below).
I searched and I don't see anything like READ_METADATA permission we had before, in order to restrict this for our users in Unity-Catalog. Do you have any idea on how can I hide this information?
09-19-2023 02:24 AM
One solution I found is, creating a function which does the decryption of the column, and from the view creation, I simply call the function and pass the column.
This solution however pushes me to put the decryption key inside the function in plain-text. But, to be honest, this wouldn't be a problem since I can make this function highly secure.
Should someone else have a better solution, please feel free to share.
09-19-2023 02:24 AM
One solution I found is, creating a function which does the decryption of the column, and from the view creation, I simply call the function and pass the column.
This solution however pushes me to put the decryption key inside the function in plain-text. But, to be honest, this wouldn't be a problem since I can make this function highly secure.
Should someone else have a better solution, please feel free to share.
01-21-2025 06:03 AM
Hi @BMex , thanks for the solution. We tried a similar approach but the function needs execute permission for the view users which is making them view the definition of the functions. How did you manage to restrict it?
01-29-2025 05:57 AM
@hafeez, if you have secrets there, use the `secret(scope, secret_key)` syntax which would "hide" the actual key. Otherwise, I don't believe there is another way to "hide" the definition. Even if you put it inside a schema which they don't have access to, they still can use DatabricksSQL to query the extended definition of the function.
01-29-2025 07:34 AM
Hi @BMex thanks for the info. We did a similar approach and guess the definition cannot be hidden :).
Thanks for your reply.
01-30-2025 06:23 AM
Databricks actually has pretty good secrets management. If you need to secure a key, do this: https://learn.microsoft.com/en-us/azure/databricks/security/secrets/
To use the key in your SQL statements, do this: https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/functions/secret
You can also move the decryption to a UDF (either Python or SQL) and call that in your view, which would also keep the encryption algorithm name away from prying eyes.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group