Here are the simple steps to reproduce it. Note that col "foo" and "bar" are just redundant cols to make sure the dataframe doesn't fit into a single partition. // generate a random df
val rand = new scala.util.Random
val df = (1 to 3000).map(i => (r...