How to merge all the columns into one column as JSON?

AmanSehgal
Honored Contributor III

I have a task to transform a dataframe.

The task is to collect all the columns of each row and embed them, as a JSON string, in a single column.

Source DF: [image]

Target DF: [image]

ACCEPTED SOLUTION

AmanSehgal
Honored Contributor III

I was able to do this by converting the DataFrame to an RDD and then applying a map function to it.

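# Pair each row's ID with the full row converted to a Python dict.
# To get a JSON string instead of a dict, wrap row.asDict() in
# json.dumps(...) (this needs "import json").
rdd_1 = df.rdd.map(lambda row: (row['ID'], row.asDict()))

# Convert the RDD of (ID, Data) pairs back to a DataFrame.
rdd_2_df = rdd_1.toDF(['ID', 'Data'])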
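
For anyone who needs the Data column to hold an actual JSON string, here is a minimal sketch of the per-row step in plain Python. The helper name `row_to_pair`, the sample column names, and the `default=str` fallback are assumptions for illustration and are not part of the original answer.

```python
import json

def row_to_pair(row_dict, key="ID"):
    """Return (key value, JSON string of the whole row)."""
    # default=str is an assumption: it stringifies values that json
    # cannot encode natively, such as dates or decimals.
    return row_dict[key], json.dumps(row_dict, default=str)

# Hypothetical sample row; the real column names come from the source DF.
sample = {"ID": 1, "Name": "abc", "Score": 9.5}
pair = row_to_pair(sample)
print(pair)

# With Spark, the same function can be applied per row, for example:
# result_df = df.rdd.map(lambda row: row_to_pair(row.asDict())).toDF(["ID", "Data"])
```

If going through an RDD is not required, the built-in `to_json` and `struct` functions in `pyspark.sql.functions` can probably do the same directly on the DataFrame, for example `df.select("ID", to_json(struct(*df.columns)).alias("Data"))`.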

REPLIES

Kaniz
Community Manager

Hi @Aman Sehgal, thank you for providing the solution here. I'm marking your answer as the best.
