cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

How to use Java MaskFormatter in sparksql?

Databrickguy
New Contributor II

I create a function based on Java MaskFormatter function in Databricks/Scala.

But when I call it from sparksql, I received error message

Error in SQL statement: AnalysisException: Undefined function: formatAccount. This function is neither a built-in/temporary function, nor a persistent function that is qualified as spark_catalog.default.formataccount.; line 1 pos 32

Here is my function

import javax.swing.text.MaskFormatter
 
def formatAccount(account: String, mask:String ) : String =
 {
  val formatter = new MaskFormatter(mask.replace("X", "A"))
  formatter.setValueContainsLiteralCharacters(false)
  val formatAccount = formatter.valueToString(account)
   formatAccount
 }

Here is the query code which received the error message

sql("""select java_method(emitToKafka ,formatAccount("1222233334", "X-XXXX-XXXX-X"))""")

However if I run below code, it works fine.

formatAccount("1222233334", "X-XXXX-XXXX-X")
res0: String = 1-2222-3333-4

what could be missed?

1 REPLY 1

Anonymous
Not applicable

@Tim zhang​ :

The issue is that the formatAccount function is defined as a Scala function, but SparkSQL is looking for a SQL function. You need to register the Scala function as a SQL function so that it can be called from SparkSQL. You can register the Scala function as a SQL function using the spark.udf.register method. Here is an example code snippet:

import org.apache.spark.sql.functions.udf
 
val formatAccountUDF = udf((account: String, mask:String ) => {
  val formatter = new MaskFormatter(mask.replace("X", "A"))
  formatter.setValueContainsLiteralCharacters(false)
  val formatAccount = formatter.valueToString(account)
  formatAccount
})
 
spark.udf.register("formatAccount", formatAccountUDF)

After registering the function, you can use it in your SparkSQL queries:

sql("""select formatAccount("1222233334", "X-XXXX-XXXX-X")""")

Hopefully this should work.

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.