def squared(s):
    return s * s

spark.udf.register("squaredWithPython", squared)
You can optionally set the return type of your UDF. The default return type is StringType.
from pyspark.sql.types import LongType

def squared_typed(s):
    return s * s

spark.udf.register("squaredWithPython", squared_typed, LongType())
spark.range(1, 20).createOrReplaceTempView("test")
%sql select id, squaredWithPython(id) as id_squared from test