cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Change data feed from target tables of APPLY CHANGES

afk
New Contributor III

Up until yesterday I was (sort of) able to read changes from target tables of apply changes operations (either through tables_changes() or using readChangeFeed). I say sort of because the meta columns (_change_type, _commit_version, _commit_timestamp) were missing from readChangeFeed but since there were no deletes on my source table I could read upserts just fine. I'm using a pattern similar to this message on stackoverflow: Databricks Delta Live Tables - Apply Changes from delta table

As of today I'm getting 

 

AnalysisException: [STREAMING_TABLE_OPERATION_NOT_ALLOWED.APPLY_CHANGES_UNSUPPORTED_OPERATION] The operation SELECT CHANGE DATA FEED is not allowed: The operation is not supported on APPLY CHANGES Streaming Tables.

 

I guess I should have taken the fact that meta columns were missing as indication that I was doing something wrong. How do you read changes from APPLY CHANGES Streaming Tables?

2 REPLIES 2

afk
New Contributor III

Thanks a lot @Retired_mod for providing more context for my question. Yes, I've been using CDC in a DLT pipeline and was reading changes to a table from one stage (say, bronze) and applying those to the next stage (say, silver) using APPLY CHANGES INTO. So the target of an APPLY CHANGES INTO operation becomes the source for a change data feed (readChangeFeed) for the next stage. Your post quotes the documentation (which I was also familiar with) in pointing out that "The target of the APPLY CHANGES INTO query cannot be used as a source for a streaming table" so I was using a materialized view to read from that target as recommended.

The error I've been getting since Friday (see my first post, it's a new error, the pipeline was working before) seems to indicate I can't read change data using CDC from a target table of an APPLY CHANGES INTO operation anymore. Am I reading this right?

If so, is an APPLY CHANGES INTO supposed to be the last operation in every pipeline (applied only to my gold tables, essentially)? And if that's the case, what's best practice with propagating changes across stages using DLT?

 

afk
New Contributor III

Thanks @Retired_mod for your response. I'm still stuck. How do I "apply [changes] to the silver layer" in a DLT pipeline without using APPLY CHANGES INTO (which I'm reserving for my gold layer)? The only way I see is to revert to using MERGE INTO and foreachBatch.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group