cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Dynamic Python UDF in unity catalog

Dnirmania
New Contributor III

Hi Team 

I am trying to create a python UDF which I want to use for column masking. This function will take 2 input parameters(column name and groups name) and return the column value if user is part of group otherwise return masked value. I wrote following line of code but it seems we can't uses spark.sql in python UDF to execute sql query.

Can anyone help me to execute SQL query in python function?

 
CREATE OR REPLACE FUNCTION ryanlakehouse.default.column_masking(column_value STRING, groups_str String)
RETURNS STRING
LANGUAGE PYTHON
AS $$

def mask_column(column, groups_str):
    groups = groups_str.split(',')
    for group in groups:
        if spark.sql(f"SELECT is_account_group_member('{group}')").collect()[0][0]:
            return column
    return '******'

column = "{column_value}"
groups_str = groups_str

return mask_column(column, groups_str)
$$;
1 ACCEPTED SOLUTION

Accepted Solutions

menotron
Valued Contributor

Hi @Dnirmania, You could achieve something similar using this UDF:

%sql
CREATE OR REPLACE FUNCTION ryanlakehouse.default.column_masking(column_value STRING, groups_str String) 
  RETURNS STRING 
  LANGUAGE SQL
  COMMENT 'Return the column value if user is part of group otherwise return masked value' 
  RETURN CASE
  WHEN array_contains(
    transform(
      split(groups_str, ','),
      x -> is_account_group_member(trim(x))
    ),
    True
  ) THEN column_value
  ELSE repeat("*", len(column_value))
END

 

View solution in original post

2 REPLIES 2

menotron
Valued Contributor

Hi @Dnirmania, You could achieve something similar using this UDF:

%sql
CREATE OR REPLACE FUNCTION ryanlakehouse.default.column_masking(column_value STRING, groups_str String) 
  RETURNS STRING 
  LANGUAGE SQL
  COMMENT 'Return the column value if user is part of group otherwise return masked value' 
  RETURN CASE
  WHEN array_contains(
    transform(
      split(groups_str, ','),
      x -> is_account_group_member(trim(x))
    ),
    True
  ) THEN column_value
  ELSE repeat("*", len(column_value))
END

 

Dnirmania
New Contributor III

Thank you so much @menotron. Your suggestion worked perfectly.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group