Your masking UDF returns a VARIANT, but it is being applied as a column-level security (CLS) policy on a column with a fixed data type (e.g., DATE).
Inside a CLS policy, Databricks requires that the final returned value must cast back into the target columnโs data type.
This is not enforced when you call the function directly (because VARIANT can hold anything), but is enforced when CLS rewrites the query, because the engine injects something similar to:
CAST(mask_column(value, 'TAG') AS DATE)
Then this happens:
When CLS checks your lookup table, the EXISTS(...) subquery returns FALSE,
So the function moves to the masking cases,
For DATE, the default mask is '1970-01-01'::VARIANT โ that part is OK,
But before hitting the DATE fallback, the engine evaluates all CASE branches,
and the INT fallback (0::VARIANT) is also part of the CASE expression tree,
The engine tries to determine a common supertype of all CASE branches,
It incorrectly concludes that the CASE can return an INT-like VARIANT,
And when casting back to DATE, the VARIANT 0 cannot be cast to DATE โ error:
[INVALID_VARIANT_CAST] The variant value `0` cannot be cast into "DATE".
This only happens when:
โ CLS is used
โ AND the UDF queries another table (EXISTS)
โ AND the masked column is DATE/TIMESTAMP
โ AND the UDF returns numeric values for other types
This is not encountered when simply calling the UDF directly.