What if I have a lot of empty shuffle partitions due to data skew? Secondly, what if the shuffle partition size is 128 MB and the size of the key's partition is 700 MB?
I have follow-up questions here: 1) The OP mentions 1 GB of data in each folder. So will Spark read ~8 partitions on 8 cores (if that many are available)? 2) What if I get empty partitions after the shuffle?
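
On question 1, the ~8 figure can be checked with a rough back-of-the-envelope calculation. This is only a sketch, assuming the 128 MB split size mentioned above is what bounds each input partition. Spark's real split planning also weighs file boundaries, per-file open cost, and available parallelism, so the actual count may differ. The function name `estimated_input_partitions` is made up for illustration.

```python
import math

def estimated_input_partitions(total_bytes, max_partition_bytes=128 * 1024 * 1024):
    # Hypothetical helper: ceiling of data size over the assumed split size.
    # It ignores file boundaries and other planner inputs, so treat it as a rough estimate.
    return math.ceil(total_bytes / max_partition_bytes)

one_gb = 1024 ** 3
print(estimated_input_partitions(one_gb))  # 1 GB / 128 MB -> 8
```

With 8 cores free, those ~8 tasks could run in a single wave. With fewer cores, they would be scheduled in several waves.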