Pivot on multiple columns

memo — Tue, 28 Nov 2023 11:50:43 GMT

I want to pass multiple columns as arguments to pivot when pivoting a DataFrame in PySpark, something like:

mydf.groupBy("id").pivot("day","city").agg(F.sum("price").alias("price"),F.sum("units").alias("units")).show()

One way I found is to create multiple DataFrames, each with a different pivot, and join them, but that results in multiple scans. Is there any other way to do this?

Re: Pivot on multiple columns

memo — Wed, 29 Nov 2023 08:33:26 GMT

To clarify: how do I pass multiple values to the pivot function? It only takes one argument. I tried passing an array and a list, but both throw errors.
