I have 4 months of data, partitioned on the Year and Month columns, so my parquet partition layout looks like
Data is present inside each monthly partition folder in parquet format.
Then I loaded data for July and also modified some values in August. After generating the required data, I tried to save the output with the same partitioning (Year, Month) as before. This time the data had entries for only those 2 months, and none for September, October and November. The result looks like
Interestingly, when I look into the September folder, the parquet files are missing, and the same is true for October and November. Judging by the last-modified timestamps, the folders themselves were touched by the second write.
I am guessing that the files went missing because the 2nd write overwrote the 1st one, since the partitioning was also on Year. Is there a way to overcome this problem and prevent the existing files from being deleted?
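
To make the guess concrete, here is a minimal Python sketch that only simulates the folder layout with placeholder files. It is not my actual write code and does not produce real parquet data. The year 2019, the file name `part-0000.parquet`, the function names, and the assumption that the first write covered August through November are all made up for illustration.

```python
import tempfile
from pathlib import Path


def partition_dir(root: Path, year: int, month: int) -> Path:
    # Hive-style layout: <root>/Year=<year>/Month=<month>
    return root / f"Year={year}" / f"Month={month}"


def write_partitioned(root: Path, partitions: dict, replace_whole_dataset: bool) -> None:
    """Write one placeholder file per (year, month) partition.

    replace_whole_dataset=True mimics an overwrite of the whole dataset root:
    every existing data file is removed, including months absent from
    `partitions`. False replaces only the partitions being written.
    """
    if replace_whole_dataset and root.exists():
        # Materialize the list first, then delete; folders are left in place.
        for old_file in list(root.rglob("*.parquet")):
            old_file.unlink()
    for (year, month), payload in partitions.items():
        target = partition_dir(root, year, month)
        target.mkdir(parents=True, exist_ok=True)
        for old_file in list(target.glob("*.parquet")):
            old_file.unlink()  # replace just this partition's files
        (target / "part-0000.parquet").write_text(payload)


def list_data_files(root: Path) -> list:
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.parquet"))


def demo(replace_whole_dataset: bool) -> list:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "dataset"
        # Hypothetical first write: four months (assumed August..November).
        first = {(2019, m): f"v1-{m}" for m in (8, 9, 10, 11)}
        # Second write: July added, August modified, nothing for the rest.
        second = {(2019, 7): "v2-7", (2019, 8): "v2-8"}
        write_partitioned(root, first, replace_whole_dataset=True)
        write_partitioned(root, second, replace_whole_dataset)
        return list_data_files(root)


if __name__ == "__main__":
    print("whole-dataset overwrite:", demo(True))
    print("per-partition overwrite:", demo(False))
```

In this simulation the whole-dataset overwrite leaves data files only under July and August, while the September, October and November folders remain but are empty, which matches what I see. The per-partition variant keeps the untouched months, and that is the behaviour I am hoping to get from the real write.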