Can anyone please suggest how to loop through a PySpark DataFrame efficiently?

Ancil
Contributor II

Scenario: I have a DataFrame with more than 1,000 rows; each row has a file path column and a result data column. I need to loop through the rows and, for each one, write the data from the result column to a file at that row's file path.

What is the easiest and most time-efficient way to do this?

I tried using collect(), but it takes a long time.

I also tried UDF-based methods, but I get the error below:

[Image: screenshot of the error message]
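
For reference, here is a minimal sketch of the per-row write described in the scenario, written as a function that could be passed to `DataFrame.foreachPartition` so the writes run on the executors rather than on the driver after `collect()`. The column names `file_path` and `result` are assumptions for illustration (the real column names are not given above), and the sketch assumes each path is writable from the worker nodes through the ordinary file API, which may not hold for every storage setup.

```python
import os


def write_partition(rows):
    """Write each row's result data to that row's file path.

    `rows` is any iterable of row-like objects that support row["name"]
    lookup (PySpark Row objects do; plain dicts work for local testing).
    The column names "file_path" and "result" are hypothetical.
    Returns the number of files written; foreachPartition ignores it.
    """
    written = 0
    for row in rows:
        path = row["file_path"]
        parent = os.path.dirname(path)
        if parent:
            # Create the target directory if it does not exist yet.
            os.makedirs(parent, exist_ok=True)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(str(row["result"]))
        written += 1
    return written


# With an existing Spark DataFrame `df` (not created in this sketch):
# df.select("file_path", "result").foreachPartition(write_partition)
```

Because the function only iterates over its input, it can be tried locally on a list of dicts before running it on the cluster.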