How to use Java MaskFormatter in sparksql?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
01-13-2023 09:42 AM
I create a function based on Java MaskFormatter function in Databricks/Scala.
But when I call it from sparksql, I received error message
Error in SQL statement: AnalysisException: Undefined function: formatAccount. This function is neither a built-in/temporary function, nor a persistent function that is qualified as spark_catalog.default.formataccount.; line 1 pos 32Here is my function
import javax.swing.text.MaskFormatter
def formatAccount(account: String, mask:String ) : String =
{
val formatter = new MaskFormatter(mask.replace("X", "A"))
formatter.setValueContainsLiteralCharacters(false)
val formatAccount = formatter.valueToString(account)
formatAccount
}Here is the query code which received the error message
sql("""select java_method(emitToKafka ,formatAccount("1222233334", "X-XXXX-XXXX-X"))""")However if I run below code, it works fine.
formatAccount("1222233334", "X-XXXX-XXXX-X")res0: String = 1-2222-3333-4what could be missed?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-10-2023 07:57 AM
@Tim zhang :
The issue is that the formatAccount function is defined as a Scala function, but SparkSQL is looking for a SQL function. You need to register the Scala function as a SQL function so that it can be called from SparkSQL. You can register the Scala function as a SQL function using the spark.udf.register method. Here is an example code snippet:
import org.apache.spark.sql.functions.udf
val formatAccountUDF = udf((account: String, mask:String ) => {
val formatter = new MaskFormatter(mask.replace("X", "A"))
formatter.setValueContainsLiteralCharacters(false)
val formatAccount = formatter.valueToString(account)
formatAccount
})
spark.udf.register("formatAccount", formatAccountUDF)After registering the function, you can use it in your SparkSQL queries:
sql("""select formatAccount("1222233334", "X-XXXX-XXXX-X")""")Hopefully this should work.