Spark 3.0 Pandas UDF
Old vs New Pandas UDF interface
This slide shows the difference between the old and the new interface. The same applies here: the new interface can also be used for the existing Grouped Aggregate Pandas UDFs. In addition, the old Pandas UDF was split into two API categories: Pandas UDFs and Pandas function APIs. You can treat Pandas UDFs the same way you use other PySpark column instances.
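To make the type-hint-based interface concrete, here is a minimal sketch, not the code from the slide. The names plus_one, mean_value, and run_with_spark are hypothetical, and the Spark part assumes PySpark 3.0 or later with PyArrow installed. The UDF bodies are plain pandas functions; the type hints are what let Spark infer the UDF kind, where the older interface passed an explicit PandasUDFType.

```python
import pandas as pd


def plus_one(v: pd.Series) -> pd.Series:
    # Series -> Series hints: Spark treats this as a scalar Pandas UDF.
    return v + 1


def mean_value(v: pd.Series) -> float:
    # Series -> scalar hints: Spark treats this as a grouped aggregate Pandas UDF.
    return float(v.mean())


def run_with_spark():
    # Hypothetical driver; imports are local so the functions above
    # can be exercised with pandas alone.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import pandas_udf

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame([(1, 1.0), (1, 2.0), (2, 3.0)], ("id", "v"))

    plus_one_udf = pandas_udf(plus_one, returnType="double")
    mean_udf = pandas_udf(mean_value, returnType="double")

    # Both UDFs are used like any other column expression.
    df.select(plus_one_udf(df["v"])).show()
    df.groupBy("id").agg(mean_udf(df["v"])).show()
    spark.stop()
```

Calling run_with_spark() in an environment with Spark available would apply both UDFs as ordinary column expressions, one in select and one in agg.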
For example, here we calculate the values by calling the Pandas UDF calculate. We also support new Pandas UDF types: from an iterator of Series to an iterator of Series, and from an iterator of multiple Series to an iterator of Series. These are useful for [inaudible] state initialization of your Pandas UDFs and also useful for Pandas UDF parquet.
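The two iterator-based UDF types might look roughly like the sketch below. The names plus_offset, multiply, and run_with_spark are hypothetical, and the constant offset only stands in for whatever one-time state a real UDF would set up before looping over batches. The Spark part again assumes PySpark 3.0 or later with PyArrow.

```python
from typing import Iterator, Tuple

import pandas as pd


def plus_offset(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # One-time setup runs once per iterator, not once per batch.
    offset = 10  # placeholder for real state initialization (assumption)
    for batch in batches:
        yield batch + offset


def multiply(batches: Iterator[Tuple[pd.Series, pd.Series]]) -> Iterator[pd.Series]:
    # Iterator of multiple Series -> iterator of Series.
    for a, b in batches:
        yield a * b


def run_with_spark():
    # Hypothetical driver; imports are local so the functions above
    # can be exercised with pandas alone.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import pandas_udf

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame([(1, 2), (3, 4)], ("a", "b"))

    plus_offset_udf = pandas_udf(plus_offset, returnType="long")
    multiply_udf = pandas_udf(multiply, returnType="long")

    df.select(plus_offset_udf("a"), multiply_udf("a", "b")).show()
    spark.stop()
```

Because the setup line sits outside the loop, it is shared by every batch the iterator yields, which is the point of the iterator variants.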
However, you cannot use Pandas function APIs with these column instances. Here are two examples: the map Pandas function API and the cogrouped map Pandas function API. These APIs are newly added in this release.
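As a rough illustration of the two Pandas function APIs, the sketch below uses DataFrame.mapInPandas and the cogroup form of applyInPandas. The function names, column names, and sample rows are hypothetical, and the Spark part assumes PySpark 3.0 or later with PyArrow. Note that both are called as methods on a DataFrame or grouped data rather than being used as column expressions.

```python
from typing import Iterator

import pandas as pd


def keep_adults(batches: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
    # Map API: receives batches of the whole DataFrame, yields DataFrames.
    for pdf in batches:
        yield pdf[pdf["age"] >= 18]


def join_groups(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    # Cogrouped map API: receives the two groups that share a key.
    return pd.merge(left, right, on="id")


def run_with_spark():
    # Hypothetical driver; imports are local so the functions above
    # can be exercised with pandas alone.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    people = spark.createDataFrame([(1, 21), (2, 15)], ("id", "age"))
    names = spark.createDataFrame([(1, "a"), (2, "b")], ("id", "name"))

    people.mapInPandas(keep_adults, schema=people.schema).show()

    people.groupby("id").cogroup(names.groupby("id")).applyInPandas(
        join_groups, schema="id long, age long, name string"
    ).show()
    spark.stop()
```

The output schema is passed explicitly in both calls because these functions may return a different number of rows and columns than they receive.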