I'm using Delta Live Tables (DLT) with the apply_changes API to manage SCD Type 1 on a source table. However, I’ve run into a limitation.
Context:
After apply_changes, I have a derived view that:
- Flattens and explodes a JSON array field in the source data.
- Produces multiple rows per original primary key (key_id).
- Needs to be written to a destination Delta table, such that whenever a key changes, all existing rows for that key_id are replaced by the new set of exploded rows.
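For reference, here is a simplified sketch of the current setup. The table, column, and schema names (cdc_feed, source_scd1, key_id, items_json, item_id, item_value, sequence_num) are placeholders for illustration:

```python
import dlt
from pyspark.sql.functions import col, explode, from_json
from pyspark.sql.types import ArrayType, StringType, StructField, StructType

# SCD Type 1 table maintained by apply_changes
dlt.create_streaming_table("source_scd1")

dlt.apply_changes(
    target="source_scd1",
    source="cdc_feed",               # upstream CDC feed (placeholder name)
    keys=["key_id"],
    sequence_by=col("sequence_num"),
    stored_as_scd_type=1,
)

# Assumed schema of the JSON array column
item_schema = ArrayType(StructType([
    StructField("item_id", StringType()),
    StructField("item_value", StringType()),
]))

# Derived view: one source row fans out into multiple rows per key_id
@dlt.view(name="exploded_by_key")
def exploded_by_key():
    return (
        dlt.read("source_scd1")
        .withColumn("item", explode(from_json(col("items_json"), item_schema)))
        .select("key_id", "item.item_id", "item.item_value")
    )
```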
Challenge:
apply_changes merges changes on the primary key and assumes a one-to-one mapping between an incoming key and a target row, which doesn't suit my case, where one incoming key expands into several exploded rows. For example, if key_id = 1 previously exploded into three rows and a new change for that key explodes into two, the three old rows must be deleted and the two new ones inserted. This makes it hard to cleanly replace existing groups of records in the target.
Question:
Is there a native way (or recommended pattern) in Databricks/DLT to support this kind of "overwrite-by-key" logic, where we delete all rows by key and insert multiple transformed rows, preferably in a streaming or near-real-time pipeline?
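To make "overwrite-by-key" concrete, this is roughly the logic I would otherwise hand-roll with Structured Streaming's foreachBatch outside of apply_changes: delete every existing row whose key appears in the micro-batch, then append the freshly exploded rows. The names here (exploded_stream_df, target_db.exploded_target, the checkpoint path) are placeholders, and the delete plus append is not a single atomic transaction:

```python
from delta.tables import DeltaTable

TARGET = "target_db.exploded_target"  # placeholder destination table

def overwrite_by_key(batch_df, batch_id):
    target = DeltaTable.forName(spark, TARGET)
    # Delete all existing rows for the keys present in this micro-batch
    (
        target.alias("t")
        .merge(
            batch_df.select("key_id").distinct().alias("s"),
            "t.key_id = s.key_id",
        )
        .whenMatchedDelete()
        .execute()
    )
    # Append the full set of newly exploded rows for those keys
    batch_df.write.format("delta").mode("append").saveAsTable(TARGET)

(
    exploded_stream_df.writeStream        # streaming DataFrame of exploded rows
    .foreachBatch(overwrite_by_key)
    .option("checkpointLocation", "/tmp/checkpoints/exploded_target")
    .start()
)
```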