A column contains encrypted data at rest. I am trying to create a SQL function that decrypts the data only if the user is a member of a particular group. Below is the function:
%sql
CREATE OR REPLACE FUNCTION test.default.decrypt_if_valid_user(col_a STRING)
RETURN CASE
  WHEN is_account_group_member('admin')
    THEN test.default.decrypt_colA(col_a, secret('fernet_key', 'fernet_key_secret'))
  ELSE col_a
END
Here, "test.default.decrypt_colA" is a function that already exists. Calling the new function directly in a query works:
%sql
select test.default.decrypt_if_valid_user(col_a) from test.default.sampletbl limit 2
This returns the decrypted data.
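To rule out the group check as the cause, the membership test can be run on its own. This is a minimal sketch, assuming the current user is expected to be in the `admin` account group; the column alias `is_admin` is only illustrative and not part of the original setup.

```sql
-- Returns true when the current user belongs to the 'admin' account group,
-- i.e. when the CASE in decrypt_if_valid_user takes the decrypt branch.
SELECT is_account_group_member('admin') AS is_admin;
```

If this returns true, the direct call above is going through the decrypt branch rather than the ELSE branch.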
Next, I applied this function to the column as a column mask by altering the table:
%sql
ALTER TABLE test.default.sampletbl ALTER COLUMN col_a SET MASK test.default.decrypt_if_valid_user
Now, when I query the table, I get the following error:
%sql
select * from test.default.sampletbl limit 2
org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot generate code for expression: test.default.decrypt_colA (input[11, string, true], secret_value)
at org.apache.spark.SparkException$.internalError(SparkException.scala:85)
at org.apache.spark.SparkException$.internalError(SparkException.scala:89)
at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotGenerateCodeForExpressionError(QueryExecutionErrors.scala:77)
at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode(Expression.scala:503)
at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode$(Expression.scala:502)
at com.databricks.sql.analyzer.ExternalUDFExpression.doGenCode(ExternalUDFExpression.scala:37)
at org.apache.spark.sql.catalyst.expressions.Expression.genCodeInternal(Expression.scala:249)
at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$2(Expression.scala:225)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.catalyst.expressions.Expression.genCode(Expression.scala:225)
at org.apache.spark.sql.catalyst.expressions.Alias.genCodeInternal(namedExpressions.scala:170)
at com.databricks.sql.expressions.codegen.EdgeExpressionCodegen$.$anonfun$genCodeWithFallback$2(EdgeExpressionCodegen.scala:269)
at scala.Option.getOrElse(Option.scala:189)
at com.databricks.sql.expressions.codegen.EdgeExpressionCodegen$.$anonfun$genCodeWithFallback$1(EdgeExpressionCodegen.scala:269)
at scala.Option.getOrElse(Option.scala:189)
at com.databricks.sql.expressions.codegen.EdgeExpressionCodegen$.genCodeWithFallback(EdgeExpressionCodegen.scala:267)
at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.generateExpression(CodeGenerator.scala:1450)
at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.$anonfun$generateExpressionsForWholeStageWithCSE$2(CodeGenerator.scala:1531)
at scala.collection.immutable.List.map(List.scala:297)
at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.$anonfun$generateExpressionsForWholeStageWithCSE$1(CodeGenerator.scala:1529)
at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.withSubExprEliminationExprs(CodeGenerator.scala:1183)
at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.generateExpressionsForWholeStageWithCSE(CodeGenerator.scala:1529)
at org.apache.spark.sql.execution.ProjectExec.doConsume(basicPhysicalOperators.scala:76)
at org.apache.spark.sql.execution.CodegenSupport.consume(WholeStageCodegenExec.scala:199)
at org.apache.spark.sql.execution.CodegenSupport.consume$(WholeStageCodegenExec.scala:154)
at org.apache.spark.sql.execution.ColumnarToRowExec.consume(Columnar.scala:78)
at org.apache.spark.sql.execution.ColumnarToRowExec.doProduce(Columnar.scala:218)
at org.apache.spark.sql.execution.CodegenSupport.$anonfun$produce$1(WholeStageCodegenExec.scala:99)
at org.apache.spark.sql.execution.SparkPlan$.org$apache$spark$sql$execution$SparkPlan$$withExecuteQueryLogging(SparkPlan.scala:107)
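As a possible stopgap while this is unresolved, the mask can be removed from the column again. This is a minimal sketch, assuming the same table and column as above and using the `DROP MASK` clause of `ALTER TABLE ... ALTER COLUMN`; whether it clears the error in this exact setup is an assumption, not something confirmed here.

```sql
-- Remove the column mask from col_a so that a plain SELECT no longer
-- invokes decrypt_if_valid_user implicitly.
ALTER TABLE test.default.sampletbl ALTER COLUMN col_a DROP MASK;
```

With the mask dropped, the column holds ciphertext for every reader again, so decryption has to go through an explicit call to test.default.decrypt_if_valid_user(col_a) in the query.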
Any idea how to resolve this issue?