06-06-2022 07:21 PM
I'm reshaping a DataFrame to meet a requirement, and I need to concatenate 2 DataFrames and then transpose the result. I've done this before in pandas, where the syntax is as follows:
import pandas as pd
df1 = pd.DataFrame(some_dict)
df2 = pd.DataFrame(some_dict)
new_df = pd.concat([df1, df2], axis="columns")  # stack the DataFrames side by side
trans_df = new_df.transpose()  # or simply new_df.T
Is there a way I could do this in PySpark? Any leads would be greatly appreciated.
06-06-2022 11:45 PM
Hi @Kaniz Fatma,
I no longer see the answer you posted, but I see you were suggesting `union`. As I understand it, `union` stacks DataFrames on top of one another when they share the same schema / column names.
In my situation, I have 2 different DataFrames with different columns (and schema) but same number of records. I want to stack them side by side.
For example: DF1 has 2 columns (a and b) and 10 rows, and DF2 has 3 columns (x, y, and z) with the same 10 rows. I want the resulting DataFrame to have 10 rows and 5 columns (a, b, x, y, and z).
Thank you
06-06-2022 11:58 PM
Hi @Riyaz Ali, This recipe helps you stack two DataFrames horizontally in PySpark. Please let me know if that helps.
06-07-2022 01:15 AM
Thanks @Kaniz Fatma, this solved half of the problem. The other half is that I need to transpose the PySpark DataFrame. Any help on this one?
06-07-2022 01:20 AM
Hi @Riyaz Ali, Here's a generic transpose method (TransposeDF) that can transpose a Spark DataFrame. Click here to get complete details of the technique. Please let me know if it helps.
06-07-2022 03:32 AM
Greatly appreciate the help, @Kaniz Fatma!
Even though I had to make multiple tweaks to the TransposeDF function, it gave me a starting point. Thanks for the prompt response; I was able to wrap up this issue within the day!
06-07-2022 03:33 AM
@Riyaz Ali, Awesome!
We've sent you the certification coupon too. Please confirm on the thread.