Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Change data feed for tables with allowColumnDefaults property "enabled"

n_joy
Contributor

I have an existing Delta table with both the delta.enableChangeDataFeed and delta.feature.allowColumnDefaults table properties enabled. However, when writing to the CDC table with streaming queries, it fails with the following error: [WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED] Failed to execute CREATE TABLE command because it assigned a column DEFAULT value, but corresponding table feature was not enabled. Please retry command again after executing ALTER TABLE table_name SET TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported')

My understanding is that this CREATE TABLE happens inside the change data feed feature itself, since it adds the metadata columns (https://docs.delta.io/latest/delta-change-data-feed.html#what-is-the-schema-for-the-change-data-feed). So I was wondering: is there a way to set the property required in this case ('delta.feature.allowColumnDefaults' = 'supported') when reading and writing with streaming queries?
I mean, how do I resolve the missing property before the streaming query starts if I am not creating or running the CREATE TABLE statement for the CDC table myself?
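For context, the streaming write is roughly the usual change-feed pattern; the sketch below is only illustrative (table names, schema, and checkpoint path are made up, not my actual code):

from pyspark.sql import SparkSession

# On Databricks the `spark` session already exists; getOrCreate() is just for completeness.
spark = SparkSession.builder.getOrCreate()

# Illustrative names only.
source_table = "main.sales.orders"    # source Delta table with CDF and DEFAULT columns enabled
cdc_table = "main.sales.orders_cdc"   # sink table the stream writes the changes to

# Read the change data feed of the source table as a stream.
changes = (
    spark.readStream
    .format("delta")
    .option("readChangeFeed", "true")
    .table(source_table)
)

# Write the change feed to the _cdc table. toTable() creates the sink table if it
# does not exist yet, and it is this implicit CREATE TABLE that fails with the error above.
query = (
    changes.writeStream
    .option("checkpointLocation", "/tmp/checkpoints/orders_cdc")  # illustrative path
    .toTable(cdc_table)
)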

6 REPLIES

n_joy
Contributor

ERROR: [WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED] Failed to execute CREATE TABLE command because it assigned a column DEFAULT value, but corresponding table feature was not enabled. Please retry command again after executing ALTER TABLE table_name SET TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported')

filipniziol
Contributor III

Hi @n_joy,
What is your limitation here? Is it that you cannot run the statement mentioned in the error message?

 

ALTER TABLE table_name 
SET TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported')

 

Do I understand correctly that the table is already created, you do not have permissions to run the ALTER statement, and so you want to somehow change this property with a streaming query?

If yes, it is not possible.

Streaming queries that insert data are DML (Data Manipulation Language) statements, which are used to modify the data within the table (e.g., INSERT, UPDATE).

ALTER statements are DDL (Data Definition Language) statements, which are used to modify the structure or properties of the table itself.

DML and DDL statements cannot be mixed in a streaming query. This is a fundamental limitation in SQL operations, meaning you cannot modify table properties while performing data insertions through streaming.

Additionally, even creating a table with a DEFAULT column fails without the 'delta.feature.allowColumnDefaults' property:

[screenshot: filipniziol_0-1725520518981.png]
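For example, a statement along these lines (table name and columns are just an illustration) fails with the same error:

# Illustrative table: a CREATE with a DEFAULT column but without the
# 'delta.feature.allowColumnDefaults' property fails with
# WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED.
spark.sql("""
    CREATE TABLE demo_defaults (
        id INT,
        status STRING DEFAULT 'NEW'
    )
    USING DELTA
""")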

To sum up, the only way to make it work is to run the ALTER command and set 'delta.feature.allowColumnDefaults' = 'supported'.

n_joy
Contributor

The case is the following: for some reason, the Change Data Feed feature recognizes the DEFAULT columns from the source table but does not enable this property when creating the _cdc table:

[screenshot: n_joy_0-1725523337664.png]

So it uses the following DDL that is impossible to execute: 

[screenshot: n_joy_2-1725523954329.png]
Instead of this one: 

[screenshot: n_joy_4-1725524553528.png]

As a workaround, before running the streaming queries I create the _cdc table myself:

[screenshot: n_joy_3-1725524454093.png]
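In text form, the workaround DDL is roughly the following sketch (the column list is only an example; the real schema mirrors the source table plus the change feed metadata columns):

# Pre-create the _cdc sink table with the allowColumnDefaults feature enabled,
# so the streaming toTable() write finds an existing table and never issues its
# own CREATE TABLE. Table name and columns are illustrative.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.sales.orders_cdc (
        order_id BIGINT,
        status STRING DEFAULT 'NEW',
        _change_type STRING,
        _commit_version BIGINT,
        _commit_timestamp TIMESTAMP
    )
    USING DELTA
    TBLPROPERTIES ('delta.feature.allowColumnDefaults' = 'supported')
""")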

I was just wondering whether there is any other way to do this without explicitly running the DDL for the _cdc table myself.

n_joy
Contributor

@filipniziol
Hi! Thank you for your response! Please see my comments above. 

filipniziol
Contributor III

Hi @n_joy ,

Based on the docs it will not be possible.

https://spark.apache.org/docs/latest/api/python/reference/pyspark.ss/api/pyspark.sql.streaming.DataS...

If you check the toTable definition, it suggests the table should already be in place ("output results to the given table").
I think the fact that the table gets created at all is pretty cool anyway.

Neither the Delta Lake documentation you shared nor the Apache Spark documentation offers any parameter that enables configuring table properties.

The solution here is to have a CREATE script for the table, specify the table properties there, and then run the streaming query to insert the data.

n_joy
Contributor

@filipniziol
Yes, that is what I do 🙂
Thanks for the feedback!
