02-14-2019 10:33 AM
I followed the approach described in the article above, but it did not work for me.
Both df1 and df2 have the same set of 1006 columns, yet the joined result has 2012 columns:
scala> df1.join(df2, Seq("file_name","post_evar30") )
res24: org.apache.spark.sql.DataFrame = [file_name: string, post_evar30: string ... 2012 more fields]
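
A likely explanation, sketched here on toy data: joining with a `Seq` of column names merges only the join-key columns into one copy each. Every other column that exists on both sides is kept twice, once per DataFrame, which would roughly double the column count. The data, the `visits` column, and the `_right` suffix below are made up for illustration; only `file_name` and `post_evar30` come from the post above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder().master("local[*]").appName("join-columns-sketch").getOrCreate()
import spark.implicits._

// Toy stand-ins for df1 and df2: identical column sets (two keys plus one hypothetical extra column).
val df1 = Seq(("a.txt", "x", 1)).toDF("file_name", "post_evar30", "visits")
val df2 = Seq(("a.txt", "x", 2)).toDF("file_name", "post_evar30", "visits")

val keys = Seq("file_name", "post_evar30")

// The key columns appear once; the non-key column appears once per side.
val joined = df1.join(df2, keys)
println(joined.columns.mkString(", "))  // file_name, post_evar30, visits, visits

// One possible workaround: give df2's non-key columns a suffix before joining,
// so every column in the result has a unique name.
val df2Renamed = df2.select(df2.columns.map { c =>
  if (keys.contains(c)) col(c) else col(c).as(s"${c}_right")
}: _*)
val disambiguated = df1.join(df2Renamed, keys)
println(disambiguated.columns.mkString(", "))  // file_name, post_evar30, visits, visits_right
```

If only the columns from df1 are needed and df2 is used just to filter rows, `df1.join(df2, keys, "left_semi")` may be a simpler option, since a left semi join returns only the left side's columns.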