
Structured streaming in Databricks using delta table

JissMathew
New Contributor II

Hi everyone, I’m new to Databricks and exploring its features. I’m trying to implement Change Data Capture (CDC) from the bronze layer to the silver layer using streaming. Could anyone share sample code or reference materials for implementing CDC with streaming in Databricks? I’m also looking to better understand the concept of streaming in Databricks. Any guidance would be greatly appreciated 😇!!

6 REPLIES

Walter_C
Databricks Employee

I suggest going through this blog post: https://www.databricks.com/blog/2022/04/25/simplifying-change-data-capture-with-databricks-delta-liv... It provides more details and a few examples you can use.

JissMathew
New Contributor II

Hi @Walter_C,

I’m looking to implement streaming using Delta tables. While I understand that Delta Live Tables simplify this process, they are unfortunately not available to use in the free trial version. Could you help guide me on how to achieve streaming with Delta tables, or share any examples or resources for this approach? Thank you!

Mike_Szklarczyk
New Contributor III

Why do you need to implement CDC from bronze to silver? That is unusual.

Some time ago, a kind person replied to me in a similar situation: 'Maybe you could elaborate more on your underlying problem rather than asking about a solution that you think is proper.' This is related to the XY problem: https://en.wikipedia.org/wiki/XY_problem

Do you need to:

  1. process your data from bronze to silver in a streaming manner (using Structured Streaming),
  2. process your data from bronze to silver using CDC (because in bronze you have, for example, delete operations on your data), or
  3. process your data from bronze to silver using CDC in a streaming manner?
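For reference, case 3 is usually handled without Delta Live Tables by combining Structured Streaming with `foreachBatch` and a Delta `MERGE`. The following is only a rough sketch under several assumptions that are not from this thread: the bronze table is an append-only log of change rows with hypothetical columns `id`, `value`, `op` (where `'DELETE'` marks a delete) and `seq` (an ordering column), the silver table already exists, and the view name `bronze_changes` is made up for illustration.

```python
CHANGES_VIEW = "bronze_changes"  # hypothetical temp view name for each micro-batch


def build_merge_sql(silver_table):
    """Return a MERGE statement that applies the latest change per id."""
    # Keep only the newest change per id (highest seq) so that MERGE never
    # sees two source rows for the same target row within one micro-batch.
    return f"""
        MERGE INTO {silver_table} AS t
        USING (
            SELECT id, value, op
            FROM (
                SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq DESC) AS rn
                FROM {CHANGES_VIEW}
            ) AS ranked
            WHERE rn = 1
        ) AS s
        ON t.id = s.id
        WHEN MATCHED AND s.op = 'DELETE' THEN DELETE
        WHEN MATCHED THEN UPDATE SET t.value = s.value
        WHEN NOT MATCHED AND s.op <> 'DELETE' THEN INSERT (id, value) VALUES (s.id, s.value)
    """


def make_upsert(silver_table):
    """Build the function that foreachBatch calls for every micro-batch."""
    merge_sql = build_merge_sql(silver_table)

    def upsert(batch_df, batch_id):
        # Expose the micro-batch to SQL, then merge it into the silver table.
        batch_df.createOrReplaceTempView(CHANGES_VIEW)
        batch_df.sparkSession.sql(merge_sql)

    return upsert


def start_cdc_stream(spark, bronze_table, silver_table, checkpoint_path):
    """Read change rows from bronze as a stream and merge them into silver."""
    return (
        spark.readStream
        .table(bronze_table)
        .writeStream
        .foreachBatch(make_upsert(silver_table))
        .option("checkpointLocation", checkpoint_path)
        .trigger(availableNow=True)
        .start()
    )
```

In a notebook this would be started with something like `start_cdc_stream(spark, "bronze.customers_changes", "silver.customers", "<checkpoint-path>")`, where both table names are placeholders. The sketch does not compare `seq` against the target, so changes that arrive out of order across micro-batches would need extra handling.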

Yes, I need this case:

  1. process your data from bronze to silver in a streaming manner (using Structured Streaming)

Mike_Szklarczyk
New Contributor III

OK, so I recommend getting familiar with these documents:
https://docs.databricks.com/en/structured-streaming/delta-lake.html#language-python 
https://docs.databricks.com/en/structured-streaming/tutorial.html

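Here is a generic template showing how the same transformation looks in the batch and the streaming approach:

# Batch approach: read the whole source table and write the result once.
(spark.read
    .table("<table-name1>")
    .<some_transformations>
    .write
    .saveAsTable("<table-name3>")
)

# Streaming approach: process only rows that arrived since the last run.
(spark.readStream
    .table("<table-name1>")
    .<some_transformations>
    .writeStream
    .trigger(availableNow=True)  # process all available data, then stop
    .option("checkpointLocation", "<checkpoint-path>")  # stores stream progress
    .toTable("<table-name3>")  # streaming writers use toTable, which starts the query
)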
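To make the streaming template more concrete, here is a minimal sketch of a bronze-to-silver job. The columns `id` and `amount`, the cleaning rules, and the function name are assumptions for illustration only; `spark` is the session that Databricks notebooks provide. On a streaming writer, `toTable` both names the target Delta table and starts the query.

```python
def stream_bronze_to_silver(spark, bronze_table, silver_table, checkpoint_path):
    """Incrementally copy cleaned rows from a bronze table to a silver table."""
    query = (
        spark.readStream
        .table(bronze_table)
        # Hypothetical cleaning rules: drop rows without a key, fix the type,
        # and record when the row was processed.
        .where("id IS NOT NULL")
        .selectExpr(
            "id",
            "CAST(amount AS DOUBLE) AS amount",
            "current_timestamp() AS processed_at",
        )
        .writeStream
        .trigger(availableNow=True)  # process everything available, then stop
        .option("checkpointLocation", checkpoint_path)  # remembers progress between runs
        .toTable(silver_table)  # starts the query and returns a StreamingQuery
    )
    # With availableNow the query ends by itself, so waiting here is safe.
    query.awaitTermination()
    return query
```

Calling `stream_bronze_to_silver(spark, "bronze.orders", "silver.orders", "<checkpoint-path>")` again later picks up only the rows added to bronze since the previous run, because the checkpoint stores the last processed position. The table names here are placeholders.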

Good luck 🙂

