I am using the distributed pandas API on Spark, not single-node pandas.
But when I try to run the following code to transform a DataFrame with 652 × 729,803 data points:

`df_ps_pct = df.pandas_api().pct_change().to_spark()`

it fails with this error: `Driver is up but is not responsive, likely due to GC.`
I have already followed this guide, "Spark job fails with Driver is temporarily unavailable - Databricks", to stop using single-node pandas.
My ultimate goal is to calculate `pct_change()` on the Spark DataFrame.
However, since Spark does not have `pct_change()`, I convert the Spark DataFrame to pandas-on-Spark first and then convert the result back to Spark.
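For context, `pct_change()` computes `(x_t - x_{t-1}) / x_{t-1}` down each column. Below is a minimal sketch of the same calculation in native Spark using window functions, under one assumption not in my data above: that the DataFrame has an explicit ordering column, here hypothetically named `date`, since Spark rows have no inherent order.

```python
from pyspark.sql import Window
from pyspark.sql import functions as F

# df is the Spark DataFrame from the question; "date" is a hypothetical
# column that defines the row order pct_change should follow.
order_col = "date"

# No partitionBy: the whole frame is treated as one ordered sequence,
# which routes all rows through a single partition (fine for 652 rows).
w = Window.orderBy(order_col)

# pct_change per column: (current - previous) / previous.
# The first row has no previous value, so it comes out null,
# matching the NaN that pandas produces in the first row.
df_pct = df.select(
    order_col,
    *[
        ((F.col(c) - F.lag(c).over(w)) / F.lag(c).over(w)).alias(c)
        for c in df.columns
        if c != order_col
    ],
)
```

Note that with ~730k columns this builds a very large query plan, so it is only a sketch of the window-function technique, not a verified fix for the driver pressure.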