Try this (in 1.4.0):
val blockSize = 1024 * 1024 * 16 // 16 MB
// dfs.blocksize sets the HDFS block size; parquet.block.size sets the
// Parquet row group size used when writing
sc.hadoopConfiguration.setInt("dfs.blocksize", blockSize)
sc.hadoopConfiguration.setInt("parquet.block.size", blockSize)
Where sc is your SparkContext (not SQLContext).
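For context, a minimal sketch of the full flow in 1.4.0 (the app name and the input/output paths below are made up for illustration):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("parquet-block-size"))
val sqlContext = new SQLContext(sc)

// Apply the smaller block / row group size before writing any Parquet
val blockSize = 1024 * 1024 * 16 // 16 MB
sc.hadoopConfiguration.setInt("dfs.blocksize", blockSize)
sc.hadoopConfiguration.setInt("parquet.block.size", blockSize)

// Subsequent Parquet writes pick up the settings from hadoopConfiguration
val df = sqlContext.read.parquet("/path/to/input")   // hypothetical path
df.write.parquet("/path/to/output")                  // hypothetical path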
I think I'm experiencing something similar.
I'm not using S3 yet, but I'm reading Parquet tables into DataFrames and trying tactics like persist, coalesce, and repartition after reading from Parquet. I'm using HiveContext, if that matters. But I get the impression tha...
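Roughly what I'm doing, if it helps (the table name, partition counts, and storage level are placeholders, not my real values):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(new SparkConf().setAppName("parquet-read"))
val hiveContext = new HiveContext(sc)

// Read a Parquet-backed Hive table into a DataFrame
val df = hiveContext.table("my_parquet_table")   // hypothetical table name

// Tactics applied after the read
val repartitioned = df.repartition(200)          // arbitrary partition count
val coalesced = df.coalesce(10)                  // arbitrary partition count
df.persist(StorageLevel.MEMORY_AND_DISK)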