hansonkx
New Contributor II

For those of you looking for a not-too-complicated solution, you can use two native Spark SQL functions, soundex and levenshtein, as your fuzzy matching algorithms.

import org.apache.spark.sql.functions.levenshtein
val joinedDF = accountDF.join(accountDF2, levenshtein(accountDF("name"), accountDF2("name")) < 3 && (accountDF("id") =!= accountDF2("id")))

joinedDF.show()
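
The join above only uses levenshtein. Here is a rough sketch of how soundex could be combined with it, assuming Spark 2.0 or later running in local mode. The sample rows, the aliases "a" and "b", and the object name FuzzyJoinSketch are made up for illustration and are not from the original data.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{levenshtein, soundex}

object FuzzyJoinSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption: not a cluster setup).
    val spark = SparkSession.builder()
      .appName("fuzzy-join-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the two account tables.
    val accountDF  = Seq((1, "Robert Smith"), (2, "Jon Doe")).toDF("id", "name")
    val accountDF2 = Seq((3, "Robert Smyth"), (4, "John Doe")).toDF("id", "name")

    val joinedDF = accountDF.as("a").join(
      accountDF2.as("b"),
      // Phonetic match: both names must share the same Soundex code.
      soundex($"a.name") === soundex($"b.name") &&
        // Spelling match: edit distance of at most 2.
        levenshtein($"a.name", $"b.name") < 3 &&
        // Do not pair a record with itself.
        $"a.id" =!= $"b.id"
    )

    joinedDF.show()
    spark.stop()
  }
}
```

Because the soundex comparison is an equality condition, Spark may be able to plan this as an equi-join on the Soundex code instead of comparing every pair of rows, which is worth checking with joinedDF.explain() on your own data. The trade-off is that pairs whose names sound different are dropped even if their edit distance is small, so loosen or remove that condition if it filters too much.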