Re: How to process a large delta table with UDF?

Hubert-Dudek · 03-24-2022

That UDF code will run on the driver, so it is better not to use it for such a big dataset. What you need is a vectorized pandas UDF: https://docs.databricks.com/spark/latest/spark-sql/udf-python-pandas.html
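
To give an idea of what that could look like, here is a minimal sketch of a Series-to-Series pandas UDF applied to a Delta table. I have not seen your actual logic, so `double_value`, `process_delta_table`, the column name `value`, and both paths are placeholders I made up for illustration; swap in your own transformation and return type.

```python
import pandas as pd


def double_value(values: pd.Series) -> pd.Series:
    """Placeholder logic: receives a whole batch of rows as a pandas Series."""
    return values * 2.0


def process_delta_table(spark, source_path: str, target_path: str, column: str = "value") -> None:
    # pyspark is imported here so the batch function above can be
    # tried out locally with pandas alone.
    from pyspark.sql.functions import col, pandas_udf

    # Wrap the plain pandas function as a vectorized UDF that returns doubles.
    # The Series -> Series type hints tell Spark which kind of pandas UDF this is.
    double_udf = pandas_udf(double_value, returnType="double")

    # Read the Delta table, apply the UDF batch by batch, write the result.
    df = spark.read.format("delta").load(source_path)
    result = df.withColumn(f"{column}_doubled", double_udf(col(column)))
    result.write.format("delta").mode("overwrite").save(target_path)
```

In a Databricks notebook `spark` already exists, so the call would be something like `process_delta_table(spark, "/path/to/source", "/path/to/target")` with your real paths. Since the function works on batches of rows via Arrow instead of one Python call per row, it usually scales much better on a large table.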

My blog: https://databrickster.medium.com/