02-09-2026 08:42 PM
Hello Guys,
I am planning to implement event-driven data ingestion across the Bronze -> Silver -> Gold layers in my project. Our data ingestion pipelines currently use batch processing, and we have decided to move to an event-driven approach. Can someone guide me on the architectural design and the steps / key factors I should capture before designing a scalable CDC / event-driven architecture? It would also be helpful if you could share some examples or documents I can use as a reference.
Thanks for all the help so far!
02-10-2026 04:28 AM
Moving from batch → event-driven / CDC on Databricks usually means adopting streaming + incremental processing across the Bronze → Silver → Gold (Medallion) layers.
Key design factors to capture upfront
Event source: Kafka / Event Hubs / Kinesis / Debezium / app events
CDC strategy: source-side CDC vs Delta Change Data Feed (CDF)
Exactly-once & ordering: idempotent writes, keys, watermarking
Schema evolution: schema enforcement vs evolution at Bronze
Data quality: quarantine bad records early (expectations)
Scalability & recovery: checkpoints, replay, backfills
Latency vs cost: micro-batch vs continuous triggers
Governance: Unity Catalog, lineage, access controls
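To make the exactly-once and ordering point a bit more concrete, here is a minimal plain-Python sketch of an idempotent, order-tolerant CDC apply. It is not a Databricks API; the event fields `key`, `seq`, `op`, and `data` and the names `apply_cdc_events` and `live_rows` are assumptions made up for illustration. Malformed events are quarantined instead of failing the batch, and deletes are kept as tombstones so a late, older event cannot resurrect a deleted row.

```python
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]  # hypothetical shape: {"key", "seq", "op", "data"}


def apply_cdc_events(
    state: Dict[Any, Event], events: List[Event]
) -> Tuple[Dict[Any, Event], List[Event]]:
    """Apply CDC events to `state` (key -> latest event seen).

    Returns the updated state and the list of quarantined events.
    """
    quarantined: List[Event] = []
    for event in events:
        key, seq = event.get("key"), event.get("seq")
        # Quarantine events that lack a key, a sequence, or a known operation.
        if key is None or seq is None or event.get("op") not in ("upsert", "delete"):
            quarantined.append(event)
            continue
        current = state.get(key)
        # Skip stale or duplicate events: this makes replays idempotent
        # and tolerates out-of-order arrival.
        if current is not None and current["seq"] >= seq:
            continue
        # Deletes are stored as tombstones rather than removed.
        state[key] = event
    return state, quarantined


def live_rows(state: Dict[Any, Event]) -> Dict[Any, Any]:
    """Return only the rows that are not deleted."""
    return {k: e.get("data") for k, e in state.items() if e["op"] == "upsert"}
```

Replaying the same batch leaves the state unchanged, which is the property you want from checkpoint-based recovery and backfills.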
Typical Databricks pattern
Bronze: Event ingestion using Auto Loader / streaming, raw append
Silver: Apply CDC using Delta Live Tables (apply_changes) or Delta CDF
Gold: Incremental aggregates / serving tables (streaming or triggered)
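As a rough sketch of the Bronze step, something like the following could ingest raw JSON files with Auto Loader and append them to a Delta table. The function names, paths, and table name are placeholders I made up; the `cloudFiles.*` options, the checkpoint location, and the `availableNow` trigger are standard Auto Loader / Structured Streaming settings, but please check them against the docs for your runtime. The `spark` session is passed in, as it would be in a Databricks notebook or job.

```python
from typing import Dict


def autoloader_options(file_format: str, schema_location: str) -> Dict[str, str]:
    """Build Auto Loader options (helper name is hypothetical)."""
    return {
        "cloudFiles.format": file_format,
        # Where Auto Loader tracks the inferred schema over time.
        "cloudFiles.schemaLocation": schema_location,
        # Evolve the schema when new columns appear in the source.
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
    }


def start_bronze_stream(spark, source_path, target_table, checkpoint_path, schema_location):
    """Start an incremental raw append from `source_path` into `target_table`."""
    options = autoloader_options("json", schema_location)
    return (
        spark.readStream.format("cloudFiles")
        .options(**options)
        .load(source_path)
        .writeStream
        # The checkpoint is what makes restarts and replays safe.
        .option("checkpointLocation", checkpoint_path)
        # Process everything available, then stop; drop this for always-on streaming.
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

With `availableNow` the same code can be run as a triggered job, which is often a cheaper first step away from batch than a continuously running stream.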
Recommended Databricks documentation
Medallion architecture:
https://docs.databricks.com/lakehouse/medallion.html
Auto Loader & streaming pipelines (Lakeflow / DLT):
https://docs.databricks.com/ldp/tutorial-pipelines.html
Delta Lake Change Data Feed (CDF):
https://docs.databricks.com/delta/delta-change-data-feed.html
CDC with Delta Live Tables (apply_changes):
https://www.databricks.com/blog/2022/04/25/simplifying-change-data-capture-with-databricks-delta-liv...
Lakeflow / DLT CDC APIs:
https://docs.databricks.com/ldp/cdc.html
Data quality & expectations:
https://docs.databricks.com/ldp/expectations.html
This combination gives you event-driven ingestion, scalable CDC, built-in data quality, and recoverability while staying fully aligned with Databricks best practices.
02-10-2026 05:41 AM
Thanks @bianca_unifeye. Let me collect all these points and come back when we get to the design stage.
02-10-2026 03:43 PM
Hi Mey,
Please also consider the Databricks file arrival trigger for your event-driven data ingestion journey.
https://docs.databricks.com/aws/en/jobs/file-arrival-triggers
Regards, Kartik