Hi everyone,
I'm encountering an issue after upgrading to Databricks Runtime 16.3 while using Auto Loader with the following configuration:
trigger(availableNow=True)
outputMode("overwrite")
When a new file arrives, Auto Loader processes it and writes the data to a Delta table. However, I consistently observe two micro-batches being triggered:
First micro-batch ingests the file and writes data to the Delta table.
Second micro-batch, triggered just a few seconds later, finds no new files and still executes in overwrite mode, which ends up removing the previously written data.
This behavior is confirmed in the Delta table history:
Version 422: file ingested, 3848 rows written.
Version 423: file removed, no new data written.
Checkpoint directory also shows commit files 422 and 423, confirming two micro-batches.
This issue started occurring after we upgraded to DBR 16.3. Prior to that, the overwrite behavior was stable and did not remove data unexpectedly.
Has anyone else encountered this issue? Is there a recommended way to prevent empty micro-batches from overwriting the table?
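For context, the kind of guard I have in mind is sketched below. It is only an idea, not something I have verified on DBR 16.3: it assumes the write is moved into a foreachBatch handler, and the function name overwrite_if_not_empty and the table name my_catalog.my_schema.my_table are placeholders I made up for illustration. The handler skips the overwrite when the micro-batch contains no rows.

```python
# Sketch of a foreachBatch handler that refuses to overwrite with an empty batch.
# The function name and default table name are hypothetical placeholders.
def overwrite_if_not_empty(batch_df, batch_id,
                           target_table="my_catalog.my_schema.my_table"):
    """Overwrite target_table only when the micro-batch has rows.

    Returns True if a write happened, False if the batch was skipped.
    """
    # DataFrame.isEmpty() is available in PySpark 3.3 and later.
    if batch_df.isEmpty():
        # Empty micro-batch: leave the existing table contents untouched.
        return False

    # Non-empty micro-batch: replace the table contents with this batch.
    batch_df.write.format("delta").mode("overwrite").saveAsTable(target_table)
    return True


# Intended wiring (assumed, not tested here); source_stream stands for the
# existing Auto Loader readStream and the checkpoint path is a placeholder:
#
# (source_stream.writeStream
#     .foreachBatch(overwrite_if_not_empty)
#     .option("checkpointLocation", "<checkpoint path>")
#     .trigger(availableNow=True)
#     .start())
```

With this approach the second, empty micro-batch would still be recorded in the checkpoint, but it should no longer produce a new table version that removes the data.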
Any guidance or best practices would be greatly appreciated!
Thanks in advance.