How to take distinct rows over multiple columns (more than 2 columns) in a PySpark DataFrame?

srchella — Tue, 05 Mar 2019 07:58:17 GMT

I have 10+ columns and want to take distinct rows, taking several of those columns into consideration. How can I achieve this using PySpark DataFrame functions?

Sandeep — Thu, 28 Mar 2019 15:06:05 GMT

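You can use dropDuplicates. Called with a list of column names, it keeps one row per distinct combination of values in those columns. Called with no arguments, it compares all columns.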