Divide a dataframe into multiple smaller dataframes based on values in multiple columns in Scala
I have to divide a dataframe into multiple smaller dataframes based on the values in columns such as gender and state. The end goal is to pick random samples from each dataframe. I am trying to implement a sample as explained below, I am quite new to th...
- 8737 Views
- 2 replies
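For readers with the same question, here is a minimal sketch of two ways this could be approached in Spark with Scala. It assumes a DataFrame with string columns named `gender` and `state`. The object name `SplitAndSample`, its function names, the `stratum` helper column and the `|` separator are all hypothetical and chosen for illustration. The first pair of functions builds one DataFrame per (gender, state) pair and samples each. The last function skips the split and uses `sampleBy` on a combined key, which may be enough if the samples are the only goal.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, concat_ws}

object SplitAndSample {
  // One DataFrame per distinct (gender, state) pair.
  // Only the distinct key pairs are collected to the driver, not the rows.
  def splitByGenderAndState(df: DataFrame): Map[(String, String), DataFrame] =
    df.select("gender", "state").distinct().collect().map { row =>
      val gender = row.getString(0)
      val state = row.getString(1)
      // <=> is null-safe equality, so rows with a null key still form a group.
      (gender, state) -> df.filter(col("gender") <=> gender && col("state") <=> state)
    }.toMap

  // Sample every group without replacement at the same fraction.
  def sampleEach(
      parts: Map[(String, String), DataFrame],
      fraction: Double,
      seed: Long): Map[(String, String), DataFrame] =
    parts.map { case (key, part) => key -> part.sample(false, fraction, seed) }

  // Alternative: stratified sampling in one pass, without splitting.
  // "stratum" is a helper column that combines the two keys.
  def stratifiedSample(df: DataFrame, fraction: Double, seed: Long): DataFrame = {
    val withKey = df.withColumn("stratum", concat_ws("|", col("gender"), col("state")))
    val fractions: Map[String, Double] = withKey
      .select("stratum")
      .distinct()
      .collect()
      .map(row => row.getString(0) -> fraction)
      .toMap
    withKey.stat.sampleBy("stratum", fractions, seed)
  }
}
```

Both `sample` and `sampleBy` treat the fraction as a per-row probability, so the size of each sample is approximate rather than exact. The split version launches one filter per group, so it is likely to suit a small number of groups better than a large one.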
Latest Reply
@raela I have a similar use case: I am writing data to different Databricks tables based on a column value. But I am getting an insufficient disk space error and the driver is getting killed. I suspect the df.select(colName).distinct().collect() step is taki...
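The pattern described in this reply, writing rows to a different table for each value of a column, could be sketched roughly as follows. This is an illustration under assumptions, not a diagnosis of the disk space error: the object name `WritePerValue`, the table naming scheme in `tableNameFor`, and the choice of `append` mode are all hypothetical. The second function shows an alternative that avoids collecting distinct values on the driver by writing a single table partitioned on the column, which only applies if one partitioned table is acceptable in place of many tables.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

object WritePerValue {
  // Hypothetical naming scheme: lower-case the value and replace any
  // character outside [a-z0-9_] with an underscore.
  def tableNameFor(prefix: String, value: String): String =
    prefix + "_" + value.toLowerCase.replaceAll("[^a-z0-9_]", "_")

  // One table per distinct value. Only the distinct values reach the driver;
  // each filtered write then runs as its own Spark job.
  def writePerValue(df: DataFrame, colName: String, tablePrefix: String): Seq[String] = {
    val values = df
      .select(col(colName).cast("string"))
      .distinct()
      .na.drop() // rows with a null key are skipped in this sketch
      .collect()
      .map(_.getString(0))
      .toSeq
    values.map { v =>
      val table = tableNameFor(tablePrefix, v)
      df.filter(col(colName).cast("string") === v)
        .write.mode("append").saveAsTable(table)
      table
    }
  }

  // Alternative: a single table partitioned by the column, with no collect().
  def writePartitioned(df: DataFrame, colName: String, table: String): Unit =
    df.write.mode("append").partitionBy(colName).saveAsTable(table)
}
```

With the per-value loop, the source DataFrame is re-evaluated for every value unless it is cached or read from a stored table, so the cost grows with the number of distinct values.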