
Change Data Feed Cost

ArjunGopinath96
New Contributor

Greetings,

I want to understand how efficient Change Data Feed (CDF) is for tracking changes to a table that has around 1 million rows. There will be around 20K appends per week. I read that CDF is not the right way to track appends. If that's true, I want to know the alternative for tracking appends. And if CDF is indeed the right way to track these changes, I want to understand the cost involved. It's only mentioned that this will have a small storage cost.

Thank You

1 REPLY

Ravivarma
Databricks Employee

Hello @ArjunGopinath96 ,

Greetings!

Change Data Feed (CDF) in Delta Lake provides an efficient way to track changes in a table, including appends. It works by recording row-level changes between versions of a Delta table, capturing both the row data and metadata to indicate whether a row was inserted, deleted, or updated. However, please note that CDF is forward-looking and only records changes that occur after it is enabled.
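As a rough illustration of the above, here is a minimal sketch of enabling CDF on an existing table and reading back only the inserted rows. The table name and starting version are placeholders, and the `spark` argument is assumed to be an active SparkSession on a runtime that supports Delta change data feed; the helper function names are made up for this example.

```python
from typing import Any


def enable_cdf_sql(table_name: str) -> str:
    """Build the SQL statement that turns on CDF for an existing Delta table."""
    return (
        f"ALTER TABLE {table_name} "
        "SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"
    )


def read_inserts(spark: Any, table_name: str, starting_version: int) -> Any:
    """Read the change feed from a given table version and keep only inserts.

    `spark` is assumed to be an active SparkSession. `starting_version` must be
    a version at or after the one where CDF was enabled, because CDF only
    records changes going forward.
    """
    changes = (
        spark.read
        .option("readChangeFeed", "true")
        .option("startingVersion", starting_version)
        .table(table_name)
    )
    # _change_type is one of: insert, update_preimage, update_postimage, delete
    return changes.filter("_change_type = 'insert'")
```

On a cluster this could be used as `spark.sql(enable_cdf_sql("my_schema.my_table"))` followed by `read_inserts(spark, "my_schema.my_table", 5)`, where both the table name and the version number are hypothetical.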

CDF can handle a table with around 1 million rows and approximately 20,000 appends per week. If your primary interest is in tracking appends, you might want to consider the APPLY CHANGES API in Delta Live Tables. This API simplifies change data capture (CDC): it can update records directly (SCD type 1) or retain history for updated records (SCD type 2).
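For reference, a sketch of what an APPLY CHANGES flow might look like in a Delta Live Tables Python pipeline. The `dlt` module only exists inside a pipeline, so it is passed in as an argument here to keep the sketch importable elsewhere. The table names, key column, and sequencing column are all hypothetical, and the exact function names can differ between runtime versions, so please check the CDC docs linked below before relying on it.

```python
from typing import Any, Dict, List


def apply_changes_kwargs(
    target: str,
    source: str,
    keys: List[str],
    sequence_col: str,
    keep_history: bool,
) -> Dict[str, Any]:
    """Collect the arguments for dlt.apply_changes in one place."""
    return {
        "target": target,
        "source": source,
        "keys": keys,
        # Column used to order change events for the same key.
        "sequence_by": sequence_col,
        # SCD type 2 keeps history for updated records; type 1 overwrites.
        "stored_as_scd_type": 2 if keep_history else 1,
    }


def define_cdc_flow(dlt_module: Any, kwargs: Dict[str, Any]) -> None:
    """Declare the target streaming table and the APPLY CHANGES flow into it.

    `dlt_module` is assumed to be the `dlt` module available in a pipeline.
    """
    dlt_module.create_streaming_table(kwargs["target"])
    dlt_module.apply_changes(**kwargs)
```

Inside a pipeline notebook this might be called as `define_cdc_flow(dlt, apply_changes_kwargs("orders_silver", "orders_raw", ["order_id"], "updated_at", keep_history=True))`, with every name in that call being a placeholder.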

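Regarding costs, enabling CDF leads to a slight increase in storage costs for the table. The change data records are generated inline as the write query runs and are generally much smaller than the total size of the rewritten files. Insert-only operations typically add no separate change data files, because the change feed for them can be computed directly from the transaction log. The exact cost depends on your specific usage and the pricing details of your Databricks plan.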
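To put "slight" into numbers, here is a back-of-the-envelope sketch for the workload in the question. It is deliberately pessimistic: it assumes every changed row is written a second time as change data. The average row size of 1,000 bytes and the 4-week retention window are made-up assumptions, not measurements, so substitute values from your own table.

```python
def weekly_change_data_upper_bound(rows_changed_per_week: int, avg_row_bytes: int) -> int:
    """Pessimistic bytes of change data per week: one extra copy of each changed row."""
    return rows_changed_per_week * avg_row_bytes


def retained_upper_bound(weekly_bytes: int, retention_weeks: int) -> int:
    """Pessimistic bytes held at once if change data is kept for `retention_weeks`."""
    return weekly_bytes * retention_weeks


# 20K appends per week (from the question); 1,000 bytes per row is an assumption.
weekly = weekly_change_data_upper_bound(20_000, 1_000)
print(weekly / 1_000_000, "MB per week")  # prints "20.0 MB per week"

# Assumed retention of 4 weeks before old change data is cleaned up.
print(retained_upper_bound(weekly, 4) / 1_000_000, "MB retained")  # prints "80.0 MB retained"
```

Even under these pessimistic assumptions the extra storage stays in the tens of megabytes, which is small next to typical table storage.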

Docs:

https://docs.databricks.com/en/delta/delta-change-data-feed.html

https://docs.databricks.com/en/delta-live-tables/cdc.html
