03-04-2019 11:58 PM
I have a DataFrame with 10+ columns and want to take distinct rows, taking multiple columns into consideration. How can I achieve this using PySpark DataFrame functions?
03-28-2019 08:06 AM
You can use dropDuplicates and pass it the list of columns to consider.
https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=distinct#pyspark.sql.Data...