Hi Team,
We have our Bronze (append), Silver (append), and Gold (merge) tables loaded continuously with Spark Structured Streaming, using a processing-time trigger of 3 seconds.
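For context, each load looks roughly like the sketch below (table names and checkpoint paths are placeholders, not our real ones):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Bronze -> Silver, append-only, micro-batch every 3 seconds
(spark.readStream
    .table("bronze_events")                              # placeholder source table
    .writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/chk/silver_events")  # placeholder path
    .trigger(processingTime="3 seconds")
    .toTable("silver_events"))                           # placeholder target table
```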
We also run maintenance jobs on these tables, such as OPTIMIZE and VACUUM, and on some tables we run DELETE to enforce a datetime-based retention policy.
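Roughly, the maintenance job does the following (table name, retention window, and date column are illustrative):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("OPTIMIZE silver_events")                 # compact small files
spark.sql("VACUUM silver_events RETAIN 168 HOURS")  # remove unreferenced files (7 days)
spark.sql("""
    DELETE FROM silver_events                       -- datetime retention policy
    WHERE event_date < current_date() - INTERVAL 30 DAYS
""")
```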
In such cases we see that our streaming jobs often fail, stating that an underlying source file was deleted, is missing, or has been updated...
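For example, the Gold merge consumes the Silver table as a streaming source, so a DELETE that rewrites Silver's data files is visible to that reader, which is where the failure surfaces. A sketch of that merge (join key, table names, and paths are placeholders):

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Each micro-batch from Silver is MERGEd into Gold. When DELETE/VACUUM
# rewrites or removes Silver's files, this streaming reader is the one
# that fails with the "source file deleted/updated" error.
def merge_into_gold(batch_df, batch_id):
    gold = DeltaTable.forName(spark, "gold_summary")  # placeholder table name
    (gold.alias("g")
         .merge(batch_df.alias("s"), "g.id = s.id")   # placeholder join key
         .whenMatchedUpdateAll()
         .whenNotMatchedInsertAll()
         .execute())

(spark.readStream
    .table("silver_events")                             # the table being maintained
    .writeStream
    .foreachBatch(merge_into_gold)
    .option("checkpointLocation", "/chk/gold_summary")  # placeholder path
    .trigger(processingTime="3 seconds")
    .start())
```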
I want to understand the recommended design or approach for running this kind of maintenance without affecting my streaming jobs.
Thanks,
Naveen