Hi Team,
I have the scenario below.
I have a Spark Streaming job with a processing-time trigger of 3 seconds, running continuously 365 days a year.
We run a weekly delete job against the source of this streaming job, based on a custom retention policy. It is a DELETE command on the (external) Delta table.
If I set skipChangeCommits to true in my readStream, will I have any data loss in my streaming job?
My source is a Bronze Delta Lake external table that is loaded in append mode only.
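For reference, here is a minimal sketch of how I plan to set the option on the read side. The function name and the table path are hypothetical placeholders, not my actual job code, and I am assuming the source is read by path with the `delta` format.

```python
def read_bronze_stream(spark, table_path):
    """Open a streaming read on the Bronze Delta table.

    `spark` is an existing SparkSession; `table_path` is a hypothetical
    placeholder for the external table location.
    """
    return (
        spark.readStream
        .format("delta")
        # Ask the Delta source to ignore commits that delete or modify
        # existing rows instead of failing the stream on them.
        .option("skipChangeCommits", "true")
        .load(table_path)
    )
```

I would call it as `read_bronze_stream(spark, "/path/to/bronze_table")` and attach the existing writeStream with the 3-second trigger to the returned DataFrame.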
The reason I want to make sure is that the option skips the entire commit. I want to know whether my weekly delete and an insert into my source data could fall under the same commit, in which case the option would skip the whole commit and cause data loss.
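To make the concern concrete, this is a rough sketch of how I was thinking of checking the table history for commits that change existing rows. It assumes each history row has a `version` and an `operation` field, as in the output of `DESCRIBE HISTORY`. The set of operation names I treat as "change" operations is my own assumption and may not match exactly what the option skips. The sample rows are made up for illustration.

```python
# Assumption: operations that remove or rewrite existing rows.
CHANGE_OPERATIONS = {"DELETE", "UPDATE", "MERGE"}


def change_commits(history_rows):
    """Return the versions whose operation is in CHANGE_OPERATIONS."""
    return [
        row["version"]
        for row in history_rows
        if row["operation"] in CHANGE_OPERATIONS
    ]


# Made-up sample: appends around one weekly DELETE.
sample_history = [
    {"version": 10, "operation": "WRITE"},
    {"version": 11, "operation": "DELETE"},
    {"version": 12, "operation": "WRITE"},
]

print(change_commits(sample_history))  # prints [11]
```

On the real table, the rows could come from `DESCRIBE HISTORY` or from `DeltaTable.forPath(spark, path).history().collect()`. In this made-up sample the DELETE is its own version, separate from the appends around it. What I want to confirm is whether that always holds for my table.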
Please review the scenario and let me know if there is any potential for data loss with this option.