I followed the same approach as in the article above, but it did not work for me.
Both df1 and df2 have the same set of 1006 columns, yet the joined result was created with 2012 columns.
scala> df1.join(df2, Seq("file_name","post_evar30") )
res24: org.apache.s...
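
To narrow this down, a quick check of which column names are repeated in the joined result might look like the sketch below. It assumes the same spark-shell session with df1 and df2 already defined; the names keys, joined and dupes are made up for illustration and do not come from the article.

```scala
// Sketch only: run in the same spark-shell session where df1 and df2 exist.
// `keys`, `joined` and `dupes` are hypothetical names used for illustration.
val keys = Seq("file_name", "post_evar30")
val joined = df1.join(df2, keys)

// Collect every column name that occurs more than once in the joined schema.
val dupes = joined.columns
  .groupBy(identity)
  .collect { case (name, occurrences) if occurrences.length > 1 => name }

println(s"total columns = ${joined.columns.length}, duplicated names = ${dupes.size}")
```

If the duplicated names are all of the non-key columns, then the join is only merging the columns listed in the Seq and keeping every other column from both sides.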