Palash01
Valued Contributor

Hey @ksamborn 

I can think of 2 solutions:

  1. Rename the column in df_2 before joining:

 

# Rename the clashing column in df_2 first, so the joined result has no duplicate name
df_2_renamed = df_2.withColumnRenamed("data1", "rename_data1")
# Joining on the column name keeps a single "key" column in the output
join_df = df_1.join(df_2_renamed, on="key")

 

  2. Use aliases for the join tables before the join:

 

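from pyspark.sql.functions import col

df_1_alias = df_1.alias("t1")
df_2_alias = df_2.alias("t2")
join_df = df_1_alias.join(df_2_alias, col("t1.key") == col("t2.key"))
# withColumnRenamed does not match qualified names such as "t2.data1",
# so select the column by its alias-qualified name and rename it there
rename_df = join_df.select("t1.*", col("t2.data1").alias("rename_data1"))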

 

Let us know if this works; otherwise, follow-up questions are welcome.

 

Leave a like if this helps! Kudos,
Palash