I'm using PySpark on Databricks and trying to pivot a 27,753,444 x 3 matrix.
If I do it with a Spark DataFrame:
df = df.groupBy("A").pivot("B").avg("C")
it takes forever (I canceled it after 2 hours).
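To make the operation concrete, here is a minimal sketch on toy data of the result that pivot is meant to produce. It uses pandas `pivot_table` as a stand-in for the Spark call; the column names A, B, C match the snippet above, while the toy values and the names `toy` and `wide` are made up for illustration.

```python
import pandas as pd

# Toy data (made up for illustration); columns named as in the Spark snippet.
toy = pd.DataFrame({
    "A": ["x", "x", "x", "y"],
    "B": ["p", "p", "q", "p"],
    "C": [1.0, 3.0, 5.0, 7.0],
})

# One row per distinct A, one column per distinct B,
# each cell holding the mean of C for that (A, B) pair.
wide = toy.pivot_table(index="A", columns="B", values="C", aggfunc="mean")
print(wide)
```

Pairs of (A, B) that never occur in the input come out as NaN, so the result gets wider with every distinct value of B.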
If I convert it to a pandas DataFrame and then pivo...