cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Issue with create_auto_cdc_flow Not Updating Business Columns for DELETE Events

kevinzhang29
New Contributor II
We 're currently working with Databricks AUTO CDC in a data pipeline and have encountered an issue with create_auto_cdc_flow (AUTO CDC) when using SCD Type 2.
 
We are using the following configuration:
 
stored_as_scd_type = 2
apply_as_deletes = expr("operation = 'DELETE'")
sequence_by = date_extraction
 
Target is a streaming table created via create_streaming_table
Our source data includes an operation column (INSERT, UPDATE, DELETE) and business fields such as account_close_dt (end date of account).
 
Observed Behavior
When a record arrives with:
operation = 'DELETE'
and a populated business field (e.g., account_close_dt )
 
We observe that:
The existing SCD2 record is closed (__END_AT is set correctly)
However, business columns such as account_close_dt are NOT updated in the target table
This results in loss of important business information associated with the closing event.
 
Expected Behavior
For our use case, we expect that:
When a "DELETE" event contains updated business fields (e.g., account closure date),
Those fields should be reflected in the closing record
Currently, neither occurs — only __END_AT is updated.
 
Key Question
Is this behavior expected when using apply_as_deletes?
 
Specifically:
When an event matches apply_as_deletes, is it always treated as a pure delete (i.e., only closing the record), ignoring all other column updates?
Is there any supported way within create_auto_cdc_flow to:
propagate business column updates along with delete events?
 
I would appreciate guidance on best practices for this pattern when using AUTO CDC.
1 REPLY 1

pradeep_singh
Contributor

Operation type DELETE means the record is supposed to disappear. If you were using SCD Type 1, the record would be removed from the silver table. When using SCD Type 2, AUTO CDC only updates the lifecycle metadata columns to make the record inactive; it does nothing to any other business columns.

For your use case, the only option is to convert the DELETE operation into an UPDATE operation before it reaches the AUTO CDC logic. If you have a view between your bronze and silver layers, you can use that view to add custom logic that checks:

  • if the business key column is not null, and
  • the operation type is DELETE,

then change the operation type to UPDATE. You might also want to add another flag that records that the original operation type was DELETE.

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev