02-10-2022 09:06 PM
Thank you for the reply.
We tried converting the PySpark DataFrame to a pandas DataFrame to produce the expected JSON format, but we stopped because of the following issues:
- Our PySpark DataFrame is very large, around 400 million+ rows, so the output needs to be split across multiple files. Because a PySpark DataFrame is distributed, we do not need to write any multiple-file logic ourselves. A pandas DataFrame, on the other hand, lives on a single machine, so it would produce one huge output file.
- The conversion from the PySpark DataFrame to a pandas DataFrame fails because our DataFrame contains deeply nested attributes.
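
For reference, here is a rough sketch of what writing the JSON straight from Spark, without going through pandas, could look like. The names `estimate_num_files`, `write_json_parts`, `rows_per_file` and the output path are made up for illustration and are not from our actual job. Note that Spark writes one JSON object per line (JSON Lines), so this only helps if that layout is acceptable for the expected format.

```python
import math


def estimate_num_files(total_rows: int, rows_per_file: int) -> int:
    """Return how many part files are needed for roughly rows_per_file rows each."""
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    return max(1, math.ceil(total_rows / rows_per_file))


def write_json_parts(df, output_path: str, rows_per_file: int = 1_000_000) -> int:
    """Write a PySpark DataFrame as JSON part files without collecting it to the driver.

    `df` is assumed to be a pyspark.sql.DataFrame; `rows_per_file` is a
    hypothetical tuning knob, not a Spark setting.
    """
    num_files = estimate_num_files(df.count(), rows_per_file)
    (
        df.repartition(num_files)  # each partition is written as its own part file
        .write.mode("overwrite")
        .json(output_path)         # struct/array columns are emitted as nested JSON
    )
    return num_files
```

With around 400 million rows and the default of 1,000,000 rows per file, this would ask Spark for about 400 part files. The extra `df.count()` costs a full pass over the data, so it may be cheaper to pass a fixed partition count to `repartition` if the row count is already known.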