Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

CDC / Event Driven Data Ingestion

Mey
New Contributor II

Hello Guys,

I am planning to implement event-driven data ingestion from the Bronze -> Silver -> Gold layers in my project. Currently we use a batch-processing approach for our data ingestion pipelines, and we have decided to move from batch processing to an event-driven approach. Can someone guide me on the architectural design and the steps / key factors I should capture before designing a scalable CDC / event-driven architecture? It would also be helpful if you could provide some examples or documents as a sample for my reference.

Thanks for all the help so far!!!

2 ACCEPTED SOLUTIONS

saurabh18cs
Honored Contributor III

Hi @Mey, I would suggest using the Lakeflow suite (Lakeflow Connect, Lakeflow Declarative Pipelines, Lakeflow Jobs) from Databricks to achieve this event-driven incremental workload. The example below uses SQL, but you can also use Python. `read_files` is an example of Databricks' natively supported Auto Loader for incremental streaming.

example workflow:

[Attached screenshots: example workflow in Lakeflow Declarative Pipelines]
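As a rough sketch of the approach described above (all table names, paths, and column names here are hypothetical, for illustration only), a Bronze -> Silver flow in Lakeflow Declarative Pipelines SQL might combine `read_files` (Auto Loader) for incremental ingestion with `APPLY CHANGES INTO` for the CDC step:

```sql
-- Bronze: incrementally ingest newly arriving raw files with Auto Loader
CREATE OR REFRESH STREAMING TABLE bronze_orders
AS SELECT *, current_timestamp() AS ingest_ts
FROM STREAM read_files(
  '/Volumes/main/raw/orders/',   -- hypothetical landing path
  format => 'json'
);

-- Silver: apply CDC events (inserts/updates/deletes) from Bronze
CREATE OR REFRESH STREAMING TABLE silver_orders;

APPLY CHANGES INTO silver_orders
FROM STREAM(bronze_orders)
KEYS (order_id)                        -- hypothetical primary key
APPLY AS DELETE WHEN op = 'DELETE'     -- hypothetical CDC operation column
SEQUENCE BY event_ts                   -- ordering column from the source feed
STORED AS SCD TYPE 1;
```

Both tables process only new data on each update, so running the pipeline in triggered mode on a schedule, or continuously, gives you incremental, event-driven behavior without reprocessing the full source.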


Advika
Community Manager

Hello @Mey,
This looks similar to your other post, where a solution was already accepted. To avoid confusion, let's please continue the conversation in that thread.

