Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

replicate the behaviour of DLT create auto cdc flow

hidden
New Contributor II

I want to custom-write (replicate) the behaviour of the DLT create auto CDC flow. How can we do it?

 

3 REPLIES

nayan_wylde
Esteemed Contributor

As of late 2025, Databricks' Lakeflow Spark Declarative Pipelines (SDP) introduced create_auto_cdc_flow() (Python) and AUTO CDC ... INTO (SQL), which replace the older DLT apply_changes API and let you customize the CDC behavior declaratively: keys, sequencing, delete/truncate handling, SCD1 vs SCD2, column-level history, null-update rules, and more.
https://docs.databricks.com/aws/en/ldp/cdc
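
For reference, a minimal Python sketch of the declarative form, assuming the parameter names carried over from apply_changes as described in the linked docs; the table and column names (users, users_cdc_raw, userId, sequenceNum, operation) are made up for illustration:

```python
import dlt
from pyspark.sql.functions import col, expr

# Target streaming table that the CDC flow maintains (SCD Type 1 here).
dlt.create_streaming_table("users")

dlt.create_auto_cdc_flow(
    target = "users",                  # streaming table created above
    source = "users_cdc_raw",          # CDC feed with one row per change event
    keys = ["userId"],                 # primary key used for matching
    sequence_by = col("sequenceNum"),  # ordering column for late/out-of-order events
    apply_as_deletes = expr("operation = 'DELETE'"),
    apply_as_truncates = expr("operation = 'TRUNCATE'"),
    except_column_list = ["operation", "sequenceNum"],  # don't store CDC metadata
    stored_as_scd_type = 1
)
```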

Poorva21
New Contributor

create_auto_cdc_flow() is the new API replacing DLT apply_changes(), used to build declarative CDC pipelines on Delta Change Data Feed (CDF). It ingests inserts, updates, and deletes from a CDC source and applies them to a target streaming table you define. You specify keys (the primary key) and sequence_by (event ordering), and you can customize behavior such as null handling, delete logic, truncation logic, column filtering, and SCD Type 1 or 2 storage. Deletes can be interpreted via apply_as_deletes, which uses temporary tombstones with configurable retention. A full table truncation can be triggered with apply_as_truncates (SCD Type 1 only). You can include or exclude specific columns and configure which columns track history. SCD Type 2 requires the target table to include the special columns __START_AT and __END_AT with the same type as sequence_by. It supports once=True for backfills (run as a batch) and works only with target streaming tables created using create_streaming_table().

https://docs.databricks.com/aws/en/ldp/developer/ldp-python-ref-apply-changes
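
To illustrate the SCD Type 2 and history-tracking options described above, here is a hedged sketch in the same style; the table and column names (users_history, users_cdc_raw, userId, sequenceNum, operation, last_login_at) are hypothetical:

```python
import dlt
from pyspark.sql.functions import col, expr

# SCD Type 2 target: the pipeline manages the __START_AT / __END_AT
# validity columns automatically, typed like sequence_by.
dlt.create_streaming_table("users_history")

dlt.create_auto_cdc_flow(
    target = "users_history",
    source = "users_cdc_raw",
    keys = ["userId"],
    sequence_by = col("sequenceNum"),
    apply_as_deletes = expr("operation = 'DELETE'"),
    ignore_null_updates = True,   # keep existing values when updates carry nulls
    stored_as_scd_type = 2,
    # Changes to this column alone do not open a new history row:
    track_history_except_column_list = ["last_login_at"]
)
```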

Hubert-Dudek
Esteemed Contributor III

And if you replicate it yourself, you need to handle dozens of edge cases, such as late-arriving data, duplicate data, out-of-order events, and so on.
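
To give a sense of what a hand-rolled replacement involves, here is a minimal sketch of replicating only the SCD Type 1 behaviour with Structured Streaming, foreachBatch, and a Delta MERGE. It covers just per-batch de-duplication and out-of-order protection; all names (users_cdc_raw, target.users, userId, sequenceNum, operation) are assumptions, and the target is assumed to share the source schema, including sequenceNum.

```python
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql.functions import col, row_number
from pyspark.sql.window import Window
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

def upsert_batch(batch_df: DataFrame, batch_id: int) -> None:
    # Duplicates / out-of-order events within the micro-batch:
    # keep only the latest change per key, based on sequenceNum.
    w = Window.partitionBy("userId").orderBy(col("sequenceNum").desc())
    latest = (batch_df
              .withColumn("rn", row_number().over(w))
              .filter(col("rn") == 1)
              .drop("rn"))

    target = DeltaTable.forName(spark, "target.users")

    # Merge into the target, ignoring late-arriving events that are older
    # than what is already stored (sequenceNum comparison).
    (target.alias("t")
        .merge(latest.alias("s"), "t.userId = s.userId")
        .whenMatchedDelete(condition="s.operation = 'DELETE' AND s.sequenceNum > t.sequenceNum")
        .whenMatchedUpdateAll(condition="s.operation != 'DELETE' AND s.sequenceNum > t.sequenceNum")
        .whenNotMatchedInsertAll(condition="s.operation != 'DELETE'")
        .execute())

(spark.readStream.table("users_cdc_raw")
    .writeStream
    .foreachBatch(upsert_batch)
    .option("checkpointLocation", "/tmp/checkpoints/users_cdc")
    .start())
```

Even this sketch ignores tombstone retention, truncates, schema evolution, and SCD Type 2 history, which is why the declarative API is usually the better option.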