Hi,
Cluster Configuration details:
RDS Configuration Details:
I have 30 files, each containing 540,000 records.
I read all the files and create a single dataframe from them.
When I write this dataframe (16,200,000 records) to a table, it takes more than 1 hour, and sometimes it fails with a "Connection time out error".
When I instead read the 30 files in multiple threads and write each dataframe to the table separately (30 threads, 30 dataframes, 540,000 records each), it takes about 30 minutes and completes without any errors.
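To make the threaded variant concrete, here is a minimal sketch of the pattern, not my actual job. `load_file` and `write_to_table` are hypothetical placeholders that only pass row counts around, so the real file reader and the real table write would have to be substituted for them.

```python
from concurrent.futures import ThreadPoolExecutor

RECORDS_PER_FILE = 540_000
NUM_FILES = 30


def load_file(index):
    """Hypothetical placeholder: stands in for reading one file into a dataframe."""
    return {"file": index, "rows": RECORDS_PER_FILE}


def write_to_table(frame):
    """Hypothetical placeholder: stands in for the real write to the table.

    Returns the number of rows it was asked to write.
    """
    return frame["rows"]


def load_and_write(index):
    # Each task handles exactly one file: read it, then write it.
    return write_to_table(load_file(index))


def run_threaded(num_files=NUM_FILES, max_workers=30):
    # One worker per file by default, so up to 30 writes are in flight at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        written = list(pool.map(load_and_write, range(num_files)))
    return sum(written)


if __name__ == "__main__":
    print(run_threaded())
```

The point of the sketch is only that each write is small (540,000 records) and several are in flight at the same time, whereas the single-dataframe case is one large write.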
I want to understand why writing a single dataframe takes so much longer.