[INTERNAL_ERROR] Cannot generate code for expression: claimsconifer.default.decrypt_colA(

nikhilkumawat
New Contributor III

A column contains data encrypted at rest. I am trying to create a SQL function that will decrypt the data if the user is part of a particular group. Below is the function:

 

%sql
CREATE OR REPLACE FUNCTION test.default.decrypt_if_valid_user(col_a STRING) 
RETURN CASE WHEN is_account_group_member('admin') THEN test.default.decrypt_colA(col_a, secret('fernet_key', 'fernet_key_secret'))
    ELSE col_a
  END

 

Here "test.default.decrypt_colA" is already created. When I ran the query to retreive data I got decrypted data.

 

%sql
select test.default.decrypt_if_valid_user(col_a) from test.default.sampletbl limit 2

 

With this I am getting decrypted data. 
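For context, the post does not show how "test.default.decrypt_colA" itself is defined; the ExternalUDFExpression frame in the stack trace further down suggests it is a non-SQL (for example Python) UDF. A minimal sketch of what such a Unity Catalog Python UDF might look like, assuming Fernet encryption and the two-argument signature used above (the body is an assumption, not taken from the post):

%sql
-- Hypothetical reconstruction of test.default.decrypt_colA as a Unity Catalog Python UDF.
-- Only the name and signature come from the question; the Fernet-based body is assumed.
CREATE OR REPLACE FUNCTION test.default.decrypt_colA(col_a STRING, key STRING)
RETURNS STRING
LANGUAGE PYTHON
AS $$
from cryptography.fernet import Fernet
if col_a is None:
    return None
return Fernet(key).decrypt(col_a.encode()).decode()
$$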

Now I applied this function directly to the column by altering the table like this:

 

%sql
ALTER TABLE test.default.sampletbl ALTER COLUMN col_a SET MASK test.default.decrypt_if_valid_user

 

Now when I try to query the table, I get the error below:

 

%sql
select * from test.default.sampletbl limit 2
org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot generate code for expression: test.default.decrypt_colA (input[11, string, true], secret_value)
	at org.apache.spark.SparkException$.internalError(SparkException.scala:85)
	at org.apache.spark.SparkException$.internalError(SparkException.scala:89)
	at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotGenerateCodeForExpressionError(QueryExecutionErrors.scala:77)
	at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode(Expression.scala:503)
	at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode$(Expression.scala:502)
	at com.databricks.sql.analyzer.ExternalUDFExpression.doGenCode(ExternalUDFExpression.scala:37)
	at org.apache.spark.sql.catalyst.expressions.Expression.genCodeInternal(Expression.scala:249)
	at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$2(Expression.scala:225)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.catalyst.expressions.Expression.genCode(Expression.scala:225)
	at org.apache.spark.sql.catalyst.expressions.Alias.genCodeInternal(namedExpressions.scala:170)
	at com.databricks.sql.expressions.codegen.EdgeExpressionCodegen$.$anonfun$genCodeWithFallback$2(EdgeExpressionCodegen.scala:269)
	at scala.Option.getOrElse(Option.scala:189)
	at com.databricks.sql.expressions.codegen.EdgeExpressionCodegen$.$anonfun$genCodeWithFallback$1(EdgeExpressionCodegen.scala:269)
	at scala.Option.getOrElse(Option.scala:189)
	at com.databricks.sql.expressions.codegen.EdgeExpressionCodegen$.genCodeWithFallback(EdgeExpressionCodegen.scala:267)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.generateExpression(CodeGenerator.scala:1450)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.$anonfun$generateExpressionsForWholeStageWithCSE$2(CodeGenerator.scala:1531)
	at scala.collection.immutable.List.map(List.scala:297)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.$anonfun$generateExpressionsForWholeStageWithCSE$1(CodeGenerator.scala:1529)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.withSubExprEliminationExprs(CodeGenerator.scala:1183)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.generateExpressionsForWholeStageWithCSE(CodeGenerator.scala:1529)
	at org.apache.spark.sql.execution.ProjectExec.doConsume(basicPhysicalOperators.scala:76)
	at org.apache.spark.sql.execution.CodegenSupport.consume(WholeStageCodegenExec.scala:199)
	at org.apache.spark.sql.execution.CodegenSupport.consume$(WholeStageCodegenExec.scala:154)
	at org.apache.spark.sql.execution.ColumnarToRowExec.consume(Columnar.scala:78)
	at org.apache.spark.sql.execution.ColumnarToRowExec.doProduce(Columnar.scala:218)
	at org.apache.spark.sql.execution.CodegenSupport.$anonfun$produce$1(WholeStageCodegenExec.scala:99)
	at org.apache.spark.sql.execution.SparkPlan$.org$apache$spark$sql$execution$SparkPlan$$withExecuteQueryLogging(SparkPlan.scala:107)

 

 

Any idea how to resolve this issue?

3 REPLIES

Kaniz
Community Manager

Hi @nikhilkumawat , 

The error message indicates that the decrypt_if_valid_user function is not properly recognized when used as a masking function in the ALTER TABLE statement.

The masking feature in Databricks is not designed to work with user-defined functions (UDFs) or external functions such as is_account_group_member and secret; it can only be used with built-in functions that it supports.

One way to solve this is to convert the decrypt_if_valid_user UDF into a built-in function using CREATE OR REPLACE TEMPORARY TABLE FUNCTION. With this approach, you define the function directly in SQL rather than in Python:

CREATE OR REPLACE TEMPORARY TABLE FUNCTION mask_decrypt(col_a STRING) 
  RETURNS STRING 
  COMMENT "Decrypts the given column if the user is a member of the 'admin' group"
  LANGUAGE SQL 
  AS "
  CASE 
    WHEN is_account_group_member('admin') 
    THEN test.default.decrypt_colA (col_a, secret('fernet_key', 'fernet_key_secret'))
    ELSE col_a
  END
";

Here, CREATE OR REPLACE TEMPORARY TABLE FUNCTION creates a built-in function that can be used with the masking feature. You can then use this function in the ALTER TABLE statement to mask the column:

ALTER TABLE test.default.sampletbl ALTER COLUMN col_a SET MASK mask_decrypt;

This should allow you to use the mask_decrypt function to mask the col_a column in the sampletbl table. Note that only built-in functions can be used with the masking feature, so any custom or external functions must first be converted to built-in functions in SQL.
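As a quick sanity check, the new function can be called directly before being attached as a mask (a sketch; table and column names are assumed from the question):

%sql
-- Verify the masking function returns the expected values before using it in SET MASK
SELECT mask_decrypt(col_a) FROM test.default.sampletbl LIMIT 2;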

 

nikhilkumawat
New Contributor III

Hi @Kaniz, I tried to create the same function that you described. It is giving this error:

(screenshot attachment: nikhil1991_0-1697522742982.png)

 

nikhilkumawat
New Contributor III

Hi @Kaniz, after removing the "TABLE" keyword from the CREATE OR REPLACE statement, the function got registered as a built-in function. Just to verify, I listed all the functions and I can see the function decrypt_if_valid_user:

(screenshot attachment: nikhil1991_0-1697541657408.png)
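For reference, the same kind of verification can be done in SQL (a sketch; the exact commands used are not shown in the post):

%sql
-- List user-defined functions visible in the current session
SHOW USER FUNCTIONS LIKE 'decrypt*';
-- Show details of the function, such as where it is registered and its definition
DESCRIBE FUNCTION EXTENDED decrypt_if_valid_user;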

Now I am trying to alter the table using the command below, and it is giving the following error:

%sql
ALTER TABLE test.default.sampletbl ALTER COLUMN col_a SET MASK decrypt_if_valid_user

(screenshot attachment: nikhil1991_1-1697541801866.png)

Although I can see this function in the list, the ALTER TABLE statement still does not recognize it.

 

 
