Exporting delta table to one CSV
12-01-2023 02:39 PM
Exporting a Delta table to a single CSV file is taking ~2 hours.
The Delta table has 66 partitions with a total size of ~6 GB, 4 million rows, and 270 columns.
I used the command below:
df.coalesce(1).write.csv("path")
What are my options to reduce the time?
12-02-2023 01:58 PM - edited 12-02-2023 02:02 PM
A very interesting task in front of you... Let me know how you solve it!
12-04-2023 11:07 AM
Hi Kainz,
None of the options I tried helped, as the challenge is not reading the data but writing it to a single CSV file. In my case, df.repartition(numFiles).write.csv("path") took the same amount of time as df.coalesce(1).write.csv("path").
Are there any other options I can explore?
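
One option that may be worth testing, sketched here under assumptions rather than as a confirmed fix: coalesce(1) funnels the whole write through a single task, so an alternative is to let Spark write the part files in parallel (for example df.write.option("header", True).csv(parts_dir) with no coalesce) and then concatenate the parts into one file outside Spark. The helper name merge_csv_parts and the directory names are hypothetical. The sketch assumes the part files are reachable through a local file path, that each part ends with a newline, and that the header occupies exactly one line. Whether this actually reduces the 2 hours depends on where the time is being spent.

```python
import glob
import os
import shutil


def merge_csv_parts(parts_dir: str, out_path: str, has_header: bool = True) -> int:
    """Concatenate part-*.csv files from parts_dir into out_path.

    Hypothetical helper: assumes every part file was written with the same
    columns and, if has_header is True, starts with a one-line header.
    Returns the number of part files merged.
    """
    # Sort so the parts are appended in a stable, predictable order.
    part_files = sorted(glob.glob(os.path.join(parts_dir, "part-*.csv")))
    with open(out_path, "wb") as out:
        for index, part in enumerate(part_files):
            with open(part, "rb") as src:
                if has_header and index > 0:
                    # Keep the header from the first part only.
                    src.readline()
                # Stream the remaining bytes without loading the file in memory.
                shutil.copyfileobj(src, out)
    return len(part_files)
```

A call would look like merge_csv_parts("parts_dir", "merged.csv"), with out_path kept outside parts_dir so the merged file is not picked up as a part. The copy is a plain byte stream, so it does not parse the 270 columns again.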

