01-03-2023 05:51 AM
I'm not sure about that. When you call to_excel, all the data is loaded onto the driver (as if you were doing a collect). So the write is not distributed, and you can run into the memory and performance problems I mentioned.
Try writing with this library:
https://github.com/crealytics/spark-excel
Example (https://github.com/crealytics/spark-excel/issues/134#issuecomment-517696354):
// Write the DataFrame using the spark-excel data source
df.write
  .format("com.crealytics.spark.excel")
  .save("test.xlsx")
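For a slightly fuller picture, here is a rough sketch of the same write as a standalone Scala program. It assumes the spark-excel package is already on the classpath or attached to the cluster. The sample data, the object name, and the local master setting are placeholders for illustration only (on Databricks you would use the existing spark session instead of building one). The "header" option is what recent releases of the library use; as far as I know older releases called it "useHeader", so check the README for the version you install.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object ExcelWriteSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local session for illustration; on a cluster reuse the provided `spark`.
    val spark = SparkSession.builder()
      .appName("excel-write-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for your real DataFrame.
    val df = Seq((1, "a"), (2, "b")).toDF("id", "label")

    df.write
      .format("com.crealytics.spark.excel")
      .option("header", "true")   // write column names as the first row
      .mode(SaveMode.Overwrite)   // replace test.xlsx if it already exists
      .save("test.xlsx")

    spark.stop()
  }
}
```

The only differences from the short example are the explicit header option and the overwrite mode, which avoids an error when the target file already exists.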