Databricks Community

ameya · ‎06-26-2024

Hi

I am new to databricks and still learning.

I am trying to do a CDC on a table.

APPLY CHANGES INTO LIVE.table1
FROM schema2.table2
KEYS (Id)
SEQUENCE BY orderByColumn
COLUMNS * EXCEPT (col1, col2)
  STORED AS SCD TYPE 1 ;

table1 is in schema1 and is silver layer. table2 is in schema2 and is bronze layer.

table2 may have different, new columns than table1. how do I insert or update these new columns into table1. I am assuming that would give us schema mismatch errors.
I want to insert some new columns based on the whether the operation is an insert operation or an update operation. If it is an update operation, I want to insert a column called 'LastModifiedDate' with the current timestamp and if it is an insert operation I want to insert a column called 'CreatedDate' as current timestamp.

How can I do these 2 things?

Thanks in advance.

raphaelblg · ‎07-01-2024

Hi @ameya ,

Scenario 1: Enabling Delta schema evolution in your table or at DLT pipeline level should suffice for the scenario of new fields being added to the schema.

Scenario 2: The INSERT statement doesn't support schema evolution as described in Delta schema evolution. You can explore the MERGE INTO statement for INSERT/UPDATE scenarios, merges will support schema evolution but I believe there is no way to set different schemas based on the underlying operation.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

View solution in original post

raphaelblg · ‎07-01-2024

Hi @ameya ,

Scenario 1: Enabling Delta schema evolution in your table or at DLT pipeline level should suffice for the scenario of new fields being added to the schema.

Scenario 2: The INSERT statement doesn't support schema evolution as described in Delta schema evolution. You can explore the MERGE INTO statement for INSERT/UPDATE scenarios, merges will support schema evolution but I believe there is no way to set different schemas based on the underlying operation.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

Databricks Community

Adding new columns to a Delta Live table in a CDC process

Connect with Databricks Users in Your Area

Submit your feedback and win a $50 gift card!

Share Your Feedback in Our Community Survey

Databricks Named a Leader in the 2024 Gartner® Magic Quadrant™ for Cloud Database Management Systems

Announcing the new Meta Llama 3.3 model on Databricks

Milestone: DatabricksTV Reaches 100 Videos!