groupBy without aggregation (PySpark API)

William_Scardua
Valued Contributor

Hi guys,

Do you have any idea how I can do a groupBy without aggregation (PySpark API)?

For example:

df.groupBy('field1', 'field2', 'field3')

My goal is to group the rows by these fields; in this case I don't need to count records or apply any aggregation.

Thank you

feiyun0112
Honored Contributor
df.select("field1","field2","field3").distinct()

Do you mean you want to get the distinct rows for the selected columns?
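If so, here is a minimal sketch of that idea. On its own, df.groupBy(...) returns a GroupedData object rather than a DataFrame, so selecting the grouping columns and calling distinct() is one way to get one row per group without any aggregation. The helper name distinct_groups, the demo function, the sample rows, and the extra "value" column are made up for illustration, and the demo assumes a local PySpark installation.

```python
def distinct_groups(df, *cols):
    """Return one row per unique combination of `cols`, with no aggregation."""
    # select() keeps only the grouping columns; distinct() collapses duplicates.
    return df.select(*cols).distinct()


def demo():
    # Assumption: PySpark is installed and can start a local session.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("distinct-groups")
        .getOrCreate()
    )
    # Hypothetical sample data: the first two rows share the same three keys.
    df = spark.createDataFrame(
        [("a", 1, "x", 10), ("a", 1, "x", 20), ("b", 2, "y", 30)],
        ["field1", "field2", "field3", "value"],
    )
    # Shows one row per (field1, field2, field3) combination.
    distinct_groups(df, "field1", "field2", "field3").show()
    spark.stop()
```

Calling demo() should display the two unique key combinations. If you need to keep the other columns as well, df.dropDuplicates(["field1", "field2", "field3"]) keeps one full row per combination instead, though which of the duplicate rows survives is not guaranteed.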