Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Is there a way to concatenate two DataFrames along either axis (row/column) and transpose the result in PySpark?

RiyazAli
Valued Contributor

I'm reshaping a DataFrame to meet a requirement, and I need to concatenate two DataFrames and then transpose the result. I've done this before in pandas, with syntax like the following:

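import pandas as pd

# some_dict is a placeholder for whatever data each DataFrame is built from
df1 = pd.DataFrame(some_dict)
df2 = pd.DataFrame(some_dict)

# axis="columns" (or axis=1) stacks the DataFrames side by side
new_df = pd.concat([df1, df2], axis="columns")

# Swap rows and columns; new_df.T is shorthand for the same thing
trans_df = new_df.transpose()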

Is there a way I could do this in PySpark? Any leads would be greatly appreciated.

3 REPLIES

RiyazAli
Valued Contributor

Hi @Kaniz Fatma,

I no longer see the answer you posted, but I see you were suggesting `union`. As I understand it, `union` stacks DataFrames one on top of another, and it expects them to have a similar schema / column names.

In my situation, I have two DataFrames with different columns (and schemas) but the same number of records. I want to stack them side by side.

For example: DF1 has 2 columns (a and b) and 10 rows, and DF2 has 3 columns (x, y, and z) and the same 10 rows. I want the resulting DataFrame to have 10 rows and 5 columns (a, b, x, y, and z).

Thank you 😀

Thanks @Kaniz Fatma, this has solved half of the problem. The other half is that I need to transpose the PySpark DataFrame. Any help on this one?

Greatly appreciate the help, @Kaniz Fatma!

Even though I had to make multiple tweaks to the TransposeDF function, it gave me the idea to start from. Thanks for the prompt response; I was able to wrap up this issue within the day!
