I am using Autoloader to load files from a directory. I have set up File Notification with the Event Subscription.
I have the backfill interval set to 1 day and have not run the stream for a week. There should be only about 100 new files to pick up, and the Spark UI shows the stage as completed.
However, the job never writes any output: it stalls for a long time and does not complete. In the driver logs, I see messages like this:
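For reference, the stream is configured roughly as in the sketch below. The file format, the function names, and the source path are placeholders assumed for illustration. Only the notification and backfill options reflect the setup described above, and the schema settings are left out.

```python
# Sketch only: the file format, function names, and source path are
# hypothetical placeholders, not values from the actual job.

def autoloader_options(file_format="json"):
    """Build Auto Loader options for file notification mode with a 1 day backfill."""
    return {
        "cloudFiles.format": file_format,        # assumed format, not stated in the post
        "cloudFiles.useNotifications": "true",   # use file notification mode
        "cloudFiles.backfillInterval": "1 day",  # periodic backfill of missed files
    }


def build_stream(spark, source_path):
    """Return a streaming DataFrame that reads source_path with Auto Loader.

    `spark` is an existing SparkSession on a Databricks cluster. A schema or
    cloudFiles.schemaLocation would also be needed; it is omitted here.
    """
    return (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_options())
        .load(source_path)
    )
```

The write side uses the existing checkpoint location, which is the state I am trying to keep.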
2023-02-10T18:35:04.867+0000: [GC (Heap Inspection Initiated GC) [PSYoungGen: 2625154K->11041K(15486464K)] 2861020K->246915K(46883840K), 0.0116171 secs] [Times: user=0.09 sys=0.00, real=0.01 secs]
2023-02-10T18:35:04.878+0000: [Full GC (Heap Inspection Initiated GC) [PSYoungGen: 11041K->0K(15486464K)] [ParOldGen: 235874K->231400K(31397376K)] 246915K->231400K(46883840K), [Metaspace: 291018K->291018K(313344K)], 0.1842356 secs] [Times: user=0.79 sys=0.00, real=0.18 secs]
These messages appear about every 20 minutes.
The job has been stalled for hours. I have tried both increasing and decreasing the cluster size.
I do not want to have to reset the checkpoint and start over.
Thanks