<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic CLS function with lookup fails on dates in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/cls-function-with-lookup-fails-on-dates/m-p/141672#M51784</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I'm conducting research on utilizing CLS in a project. We are implementing a lookup table to determine what tags a user can see. The CLS function looks like this:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;CREATE OR REPLACE FUNCTION {catalog}.{schema}.mask_column(value VARIANT, tag STRING)
  RETURNS VARIANT
  COMMENT 'ABAC utility: Completely mask string values with asterisks'
  RETURN 
  CASE
    WHEN EXISTS (
      SELECT 1
      FROM {catalog}.{schema}.lookup_access
        WHERE (lookup_access.tag = mask_column.tag 
        AND principal = current_user()) 
        AND can_view = true
    ) THEN value
    WHEN schema_of_variant(value) = 'DATE' THEN '1970-01-01'::VARIANT
    WHEN schema_of_variant(value) = 'TIMESTAMP' THEN '1970-01-01'::VARIANT
    WHEN schema_of_variant(value) = 'INT' THEN 0::VARIANT
    WHEN schema_of_variant(value) = 'BIGINT' THEN 0::VARIANT
    WHEN schema_of_variant(value) = 'DOUBLE' THEN 0.00::VARIANT
    WHEN schema_of_variant(value) = 'STRING' THEN '***'::VARIANT
    ELSE NULL::VARIANT
  END;&lt;/LI-CODE&gt;&lt;P&gt;Used alone, it works perfectly fine. It also works as intended when implemented with a policy. But, when using a policy with this function, and selecting a DATE type column from a table, the following error appears:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[INVALID_VARIANT_CAST] The variant value `0` cannot be cast into `"DATE"`. Please use `try_variant_get` instead. SQLSTATE: 22023&lt;/LI-CODE&gt;&lt;P&gt;If I remove the date fields from the select query, the function works as intended. Additionally, if I remove the lookup table query from the UDF, the policy also works for date-type columns.&lt;/P&gt;&lt;P&gt;Why is this happening? Are there any workarounds, or is this a known bug?&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 11 Dec 2025 13:46:30 GMT</pubDate>
    <dc:creator>skuvisk</dc:creator>
    <dc:date>2025-12-11T13:46:30Z</dc:date>
    <item>
      <title>CLS function with lookup fails on dates</title>
      <link>https://community.databricks.com/t5/data-engineering/cls-function-with-lookup-fails-on-dates/m-p/141672#M51784</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I'm conducting research on utilizing CLS in a project. We are implementing a lookup table to determine what tags a user can see. The CLS function looks like this:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;CREATE OR REPLACE FUNCTION {catalog}.{schema}.mask_column(value VARIANT, tag STRING)
  RETURNS VARIANT
  COMMENT 'ABAC utility: Completely mask string values with asterisks'
  RETURN 
  CASE
    WHEN EXISTS (
      SELECT 1
      FROM {catalog}.{schema}.lookup_access
        WHERE (lookup_access.tag = mask_column.tag 
        AND principal = current_user()) 
        AND can_view = true
    ) THEN value
    WHEN schema_of_variant(value) = 'DATE' THEN '1970-01-01'::VARIANT
    WHEN schema_of_variant(value) = 'TIMESTAMP' THEN '1970-01-01'::VARIANT
    WHEN schema_of_variant(value) = 'INT' THEN 0::VARIANT
    WHEN schema_of_variant(value) = 'BIGINT' THEN 0::VARIANT
    WHEN schema_of_variant(value) = 'DOUBLE' THEN 0.00::VARIANT
    WHEN schema_of_variant(value) = 'STRING' THEN '***'::VARIANT
    ELSE NULL::VARIANT
  END;&lt;/LI-CODE&gt;&lt;P&gt;Used alone, it works perfectly fine. It also works as intended when implemented with a policy. But, when using a policy with this function, and selecting a DATE type column from a table, the following error appears:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[INVALID_VARIANT_CAST] The variant value `0` cannot be cast into `"DATE"`. Please use `try_variant_get` instead. SQLSTATE: 22023&lt;/LI-CODE&gt;&lt;P&gt;If I remove the date fields from the select query, the function works as intended. Additionally, if I remove the lookup table query from the UDF, the policy also works for date-type columns.&lt;/P&gt;&lt;P&gt;Why is this happening? Are there any workarounds, or is this a known bug?&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Dec 2025 13:46:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cls-function-with-lookup-fails-on-dates/m-p/141672#M51784</guid>
      <dc:creator>skuvisk</dc:creator>
      <dc:date>2025-12-11T13:46:30Z</dc:date>
    </item>
    <item>
      <title>Re: CLS function with lookup fails on dates</title>
      <link>https://community.databricks.com/t5/data-engineering/cls-function-with-lookup-fails-on-dates/m-p/141693#M51792</link>
      <description>&lt;P&gt;Your masking UDF returns a VARIANT, but it is being applied as a column-level security (CLS) policy on a column with a fixed data type (e.g., DATE).&lt;/P&gt;&lt;P&gt;Inside a CLS policy, Databricks requires that the final returned value must cast back into the target column’s data type.&lt;/P&gt;&lt;P&gt;This is not enforced when you call the function directly (because VARIANT can hold anything), but is enforced when CLS rewrites the query, because the engine injects something similar to:&lt;/P&gt;&lt;P&gt;CAST(mask_column(value, 'TAG') AS DATE)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Then this happens:&lt;/P&gt;&lt;P&gt;When CLS checks your lookup table, the EXISTS(...) subquery returns FALSE,&lt;/P&gt;&lt;P&gt;So the function moves to the masking cases,&lt;/P&gt;&lt;P&gt;For DATE, the default mask is '1970-01-01'::VARIANT — that part is OK,&lt;/P&gt;&lt;P&gt;But before hitting the DATE fallback, the engine evaluates all CASE branches,&lt;/P&gt;&lt;P&gt;and the INT fallback (0::VARIANT) is also part of the CASE expression tree,&lt;/P&gt;&lt;P&gt;The engine tries to determine a common supertype of all CASE branches,&lt;/P&gt;&lt;P&gt;It incorrectly concludes that the CASE can return an INT-like VARIANT,&lt;/P&gt;&lt;P&gt;And when casting back to DATE, the VARIANT 0 cannot be cast to DATE → error:&lt;/P&gt;&lt;P&gt;[INVALID_VARIANT_CAST] The variant value `0` cannot be cast into "DATE".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This only happens when:&lt;/P&gt;&lt;P&gt;✔ CLS is used&lt;/P&gt;&lt;P&gt;✔ AND the UDF queries another table (EXISTS)&lt;/P&gt;&lt;P&gt;✔ AND the masked column is DATE/TIMESTAMP&lt;/P&gt;&lt;P&gt;✔ AND the UDF returns numeric values for other types&lt;/P&gt;&lt;P&gt;This is not encountered when simply calling the UDF directly.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Dec 2025 17:00:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cls-function-with-lookup-fails-on-dates/m-p/141693#M51792</guid>
      <dc:creator>Poorva21</dc:creator>
      <dc:date>2025-12-11T17:00:40Z</dc:date>
    </item>
    <item>
      <title>Re: CLS function with lookup fails on dates</title>
      <link>https://community.databricks.com/t5/data-engineering/cls-function-with-lookup-fails-on-dates/m-p/141734#M51809</link>
      <description>&lt;P&gt;Thank you for an insightful answer&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/199898"&gt;@Poorva21&lt;/a&gt;.&amp;nbsp;&lt;/P&gt;&lt;P&gt;I conclude from your reasoning that this is the result of an optimization/engine error. It seems like I will need to resort to a workaround for the date columns then...&lt;/P&gt;</description>
      <pubDate>Fri, 12 Dec 2025 09:24:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cls-function-with-lookup-fails-on-dates/m-p/141734#M51809</guid>
      <dc:creator>skuvisk</dc:creator>
      <dc:date>2025-12-12T09:24:45Z</dc:date>
    </item>
  </channel>
</rss>

