-werners-
Esteemed Contributor III

why do you use a UDF to add columns?  You could write a pyspark function without registering it as a UDF.

A UDF will bypass all optimizations and, when you use Python, performance is bad.

Can you share your UDF code?