10-27-2023 02:02 AM
@raela I have a similar use case: I am writing data to different Databricks tables based on a column value.
But I am getting an insufficient disk space error and the driver is getting killed. I suspect that the
df.select(colName).distinct().collect()
step is using a lot of memory on the driver, since the DataFrame is huge.
Is there a recommended way to handle this?