Resolved! PySpark UDF is taking long to process
Hi, I have a UDF that runs for each Spark DataFrame row, does some complex processing, and returns a string output. But it takes very long even when the data is only 15000 rows. I have configured the cluster with autoscaling, but it's not spinning up more servers. Please suggest h...
- 8977 Views
- 3 replies
- 5 kudos
Latest Reply
Hi @Sanjay Jain, Python UDFs are generally slow because each row has to be serialized between the JVM and a separate Python worker process, and that overhead can also lead to OOM errors. To resolve this issue, please consider the following: Use Spark built-in functions to do the same functionalit...
- 5 kudos