11-29-2017 11:01 AM
For those of you looking for a fairly simple solution, you can use two of Spark's built-in functions, soundex and levenshtein, as your fuzzy matching algorithms.
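import org.apache.spark.sql.functions.levenshtein

// Keep pairs whose names differ by at most 2 edits, skipping rows with the same id.
// =!= replaces !==, which is deprecated since Spark 2.0.
val joinedDF = accountDF.join(
  accountDF2,
  levenshtein(accountDF("name"), accountDF2("name")) < 3 && (accountDF("id") =!= accountDF2("id"))
)
joinedDF.show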
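The join above only uses levenshtein. Here is a rough sketch of how soundex could be combined with it, so that two names must sound alike and also be within a small edit distance. The sample rows, the object name FuzzyJoinSketch, and the helper fuzzyMatches are made up for illustration; only soundex, levenshtein, and the "id"/"name" columns come from the post. Treat it as a starting point, not a tuned matcher.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{levenshtein, soundex}

object FuzzyJoinSketch {
  // Hypothetical helper: pairs rows whose names share a Soundex code
  // and are fewer than 3 edits apart, excluding rows with the same id.
  def fuzzyMatches(left: DataFrame, right: DataFrame): DataFrame =
    left.join(
      right,
      soundex(left("name")) === soundex(right("name")) &&
        levenshtein(left("name"), right("name")) < 3 &&
        left("id") =!= right("id")
    )

  def main(args: Array[String]): Unit = {
    // Local session, for trying the sketch out on a laptop only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("fuzzy-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; the column names follow the join in the post.
    val accountDF  = Seq((1, "Jon Smith"), (2, "Acme Corp")).toDF("id", "name")
    val accountDF2 = Seq((3, "John Smith"), (4, "Globex Inc")).toDF("id", "name")

    fuzzyMatches(accountDF, accountDF2).show()
    spark.stop()
  }
}
```

Requiring both conditions is stricter than using either one alone. If you would rather catch more candidates, swap the first && for || and review the extra matches by hand.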