Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

CDC / Event Driven Data Ingestion

Mey
New Contributor II

Hello Guys,

I am planning to implement event-driven data ingestion across the Bronze -> Silver -> Gold layers in my project. Currently we use a batch processing approach for our data ingestion pipelines, and we have decided to move to an event-driven approach. Can someone guide me on the architectural design and the steps / key factors I have to capture before designing a scalable CDC / event-driven architecture? It would also be great if you could share some examples or documents as a reference for my verification.

Thanks for all the help so far!

 


3 REPLIES

bianca_unifeye
Databricks MVP (Accepted Solution)

Moving from batch → event-driven / CDC on Databricks usually means adopting streaming + incremental processing across the Bronze → Silver → Gold (Medallion) layers.

Key design factors to capture upfront

  • Event source: Kafka / Event Hubs / Kinesis / Debezium / app events

  • CDC strategy: source-side CDC vs Delta Change Data Feed (CDF)

  • Exactly-once & ordering: idempotent writes, keys, watermarking

  • Schema evolution: schema enforcement vs evolution at Bronze

  • Data quality: quarantine bad records early (expectations)

  • Scalability & recovery: checkpoints, replay, backfills

  • Latency vs cost: micro-batch vs continuous triggers

  • Governance: Unity Catalog, lineage, access controls
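Several of the factors above (idempotent writes, per-key ordering, checkpoints for replay and recovery) come together in the common Structured Streaming foreachBatch + MERGE pattern. A minimal sketch, assuming illustrative table and column names (`bronze_events`, `silver_users`, `user_id`, `event_ts`); imports are kept inside the functions so the sketch is self-contained outside a Spark runtime:

```python
# Hedged sketch: idempotent CDC upsert from Bronze to Silver.
# Re-running a micro-batch produces the same Silver state (MERGE is
# keyed), and the checkpoint location makes the stream recoverable.

def upsert_microbatch(batch_df, batch_id):
    """Keep only the latest event per key in this micro-batch, then MERGE."""
    from pyspark.sql import Window, functions as F
    from delta.tables import DeltaTable

    latest = (
        batch_df.withColumn(
            "_rn",
            F.row_number().over(
                Window.partitionBy("user_id").orderBy(F.col("event_ts").desc())
            ),
        )
        .filter("_rn = 1")
        .drop("_rn")
    )
    target = DeltaTable.forName(batch_df.sparkSession, "silver_users")
    (
        target.alias("t")
        .merge(latest.alias("s"), "t.user_id = s.user_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )

def start_stream(spark):
    """Wire the Bronze stream into the idempotent upsert."""
    return (
        spark.readStream.table("bronze_events")
        .writeStream.foreachBatch(upsert_microbatch)
        .option("checkpointLocation", "/chk/silver_users")  # replay / recovery
        .trigger(availableNow=True)  # or processingTime="1 minute" for micro-batch
        .start()
    )
```

Deduplicating to the latest `event_ts` per key before the MERGE is what makes late or replayed events safe: reprocessing the same batch converges to the same result.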

Typical Databricks pattern

  • Bronze: Event ingestion using Auto Loader / streaming, raw append

  • Silver: Apply CDC using Delta Live Tables (apply_changes) or Delta CDF

  • Gold: Incremental aggregates / serving tables (streaming or triggered)
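The Bronze and Silver steps of that pattern can be sketched as a Delta Live Tables pipeline. On Databricks you would `import dlt` at the top of the pipeline notebook and call `register_pipeline(dlt, spark)`; the table names (`bronze_events`, `silver_users`), key columns, and JSON source path are illustrative assumptions:

```python
# Hedged sketch: Bronze via Auto Loader, Silver via dlt.apply_changes.
# The dlt module is passed in so the sketch stays importable outside a
# Databricks DLT pipeline, where dlt is normally imported at module level.

def register_pipeline(dlt, spark, raw_path="/mnt/raw/events"):
    @dlt.table(comment="Raw events ingested incrementally with Auto Loader")
    def bronze_events():
        return (
            spark.readStream.format("cloudFiles")               # Auto Loader
            .option("cloudFiles.format", "json")
            .option("cloudFiles.schemaLocation", raw_path + "/_schema")
            .load(raw_path)
        )

    # Silver: apply CDC changes keyed by user_id; latest event_ts wins.
    dlt.create_streaming_table("silver_users")
    dlt.apply_changes(
        target="silver_users",
        source="bronze_events",
        keys=["user_id"],
        sequence_by="event_ts",
        stored_as_scd_type=1,  # type 1 = update in place; use 2 to keep history
    )
```

`apply_changes` handles out-of-order events via `sequence_by`, which is why capturing a reliable ordering column in the source feed is one of the key design factors above.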

Recommended Databricks documentation: the Auto Loader guide, the Delta Live Tables apply_changes (CDC) reference, and the Delta Lake Change Data Feed guide.

This combination gives you event-driven ingestion, scalable CDC, built-in data quality, and recoverability while staying fully aligned with Databricks best practices.
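For the Gold layer, Delta Change Data Feed lets downstream readers consume only the rows that changed in Silver rather than rescanning the table. A minimal sketch, assuming the illustrative `silver_users` table has `delta.enableChangeDataFeed = true` set:

```python
# Hedged sketch: incremental read of Silver changes via Delta CDF.
# The table name and starting version are illustrative assumptions.

def read_silver_changes(spark, since_version=0):
    """Stream only inserted/updated/deleted rows from the Silver table."""
    return (
        spark.readStream
        .option("readChangeFeed", "true")
        .option("startingVersion", since_version)
        .table("silver_users")
    )
```

The resulting stream carries change-metadata columns (such as the change type and commit version), which Gold aggregates can use to apply deltas incrementally instead of recomputing from scratch.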

Mey
New Contributor II

Thanks @bianca_unifeye! Let me collect all these points and come back at design time.

KartikBhatnagar
New Contributor III

Hi Mey,

Please also consider the Databricks file arrival trigger for your event-driven data ingestion journey.

https://docs.databricks.com/aws/en/jobs/file-arrival-triggers
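A file arrival trigger is configured on the job itself rather than in the pipeline code. A minimal sketch of the Jobs API trigger block, where the storage URL is a placeholder:

```json
{
  "trigger": {
    "pause_status": "UNPAUSED",
    "file_arrival": {
      "url": "s3://my-bucket/landing/",
      "min_time_between_triggers_seconds": 60
    }
  }
}
```

This starts the job whenever new files land in the monitored location, which pairs naturally with an Auto Loader Bronze ingest.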

Regards, Kartik 

 
