Core Problem
- The bronze table is not append-only: Qlik truncates and reinserts it every second.
- DLT (Delta Live Tables) in continuous mode assumes append-only streaming sources (like Kafka).
- Because Qlik wipes and replaces data every second, DLT cannot guarantee no data loss if you read bronze directly in streaming mode.
Why This Breaks Streaming
Streaming queries in Databricks track offsets or newly appended files.
If Qlik truncates, then:
The data that was there is gone.
DLT sees the same table "start over" every second → lost micro-batches.
No checkpointing can recover truncated rows.
So in the current setup, you're effectively treating the bronze table like a volatile cache, not a durable streaming source.
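The offset-tracking behavior described above can be shown with a toy simulation (plain Python, all names illustrative, not the actual Databricks internals): a reader that checkpoints how far it has read silently skips replacement rows after a truncate.

```python
# Toy simulation of why an offset-tracking streaming reader loses data
# when its source is truncated and refilled (as Qlik does every second).

class OffsetTrackingReader:
    """Mimics a streaming query: remembers how many rows it has consumed."""
    def __init__(self):
        self.offset = 0  # checkpointed position in the source

    def read_micro_batch(self, source_rows):
        # A streaming reader only looks at rows past its last offset.
        new_rows = source_rows[self.offset:]
        self.offset = len(source_rows)
        return new_rows

source = ["row1", "row2", "row3"]
reader = OffsetTrackingReader()
print(reader.read_micro_batch(source))  # ['row1', 'row2', 'row3'], offset = 3

# Qlik-style truncate + insert: the table is wiped and refilled.
source = ["row4"]

# The checkpointed offset (3) now points past the end of the new data,
# so the replacement row is silently skipped -- it is lost.
print(reader.read_micro_batch(source))  # []
```

This is why no checkpoint can help: the checkpoint faithfully records a position in data that no longer exists.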
Options to Solve This
1. Add a Durable Append Layer Before DLT
Instead of pointing DLT to the truncate-load bronze table, introduce an append-only ingestion layer.
Example:
Qlik → writes to staging (truncated every second).
A lightweight job (Auto Loader or Structured Streaming with foreachBatch) → copies new rows into an append-only Delta table (the true bronze).
DLT (continuous) → reads from this append-only table safely.
This decouples Qlikโs truncate pattern from your streaming system.
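The copy job's core logic can be sketched in plain Python (a minimal sketch of the idea, not PySpark; it assumes each Qlik row carries a monotonically increasing change id, here called `seq`, which is a hypothetical field name): each run reads the current staging snapshot and appends only rows above a persisted high-water mark, never deleting.

```python
# Sketch of the decoupling job: turn truncate-load staging snapshots
# into an append-only bronze log. Assumes every Qlik row has a unique,
# monotonically increasing change id ("seq" -- hypothetical field name).

bronze_append = []   # durable, append-only "true bronze"
last_seen_seq = -1   # high-water mark, persisted between runs

def copy_new_rows(staging_snapshot):
    """Append only rows newer than the high-water mark; never delete."""
    global last_seen_seq
    fresh = [r for r in staging_snapshot if r["seq"] > last_seen_seq]
    bronze_append.extend(fresh)
    if fresh:
        last_seen_seq = max(r["seq"] for r in fresh)
    return fresh

# Tick 1: Qlik loads two rows into staging.
copy_new_rows([{"seq": 1, "id": "a"}, {"seq": 2, "id": "b"}])
# Tick 2: Qlik truncates and reloads; one row is new.
copy_new_rows([{"seq": 2, "id": "b"}, {"seq": 3, "id": "c"}])

print(len(bronze_append))  # 3 -- nothing lost across the truncate
```

In the real pipeline this logic would live inside a foreachBatch writer (or be replaced by Auto Loader if Qlik lands files), with the high-water mark kept in the checkpoint or a control table rather than a global variable.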
2. Snapshot Approach (Batch DLT) - I would recommend this
If you must keep truncate load, then treat each secondโs truncate-load as a full snapshot.
DLT can run in triggered batch mode every second (or every few seconds):
Compare the new snapshot with the last snapshot.
Compute delta changes (insert/update/delete).
Write results to silver.
Downside: not true "streaming," but avoids data loss.
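The compare-and-classify step above can be sketched as follows (plain Python, illustrative names; in practice this would be a join or MERGE between the previous and current snapshots keyed on the primary key):

```python
# Sketch of snapshot comparison: given the previous and current
# snapshots keyed by primary key, classify each row as an insert,
# update, or delete to apply to silver.

def diff_snapshots(prev, curr):
    """prev/curr: dict mapping primary key -> row dict."""
    inserts = {k: v for k, v in curr.items() if k not in prev}
    deletes = {k: v for k, v in prev.items() if k not in curr}
    updates = {k: v for k, v in curr.items()
               if k in prev and prev[k] != v}
    return inserts, updates, deletes

prev = {1: {"name": "a"}, 2: {"name": "b"}}
curr = {2: {"name": "b2"}, 3: {"name": "c"}}

ins, upd, dele = diff_snapshots(prev, curr)
print(sorted(ins), sorted(upd), sorted(dele))  # [3] [2] [1]
```

The resulting insert/update/delete sets map directly onto the `WHEN MATCHED` / `WHEN NOT MATCHED` clauses of a Delta MERGE into silver.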