Adding new columns to a Delta Live table in a CDC process

ameya
New Contributor

Hi 

I am new to databricks and still learning.

I am trying to do a CDC on a table.

 

 

APPLY CHANGES INTO LIVE.table1
FROM schema2.table2
KEYS (Id)
SEQUENCE BY orderByColumn
COLUMNS * EXCEPT (col1, col2)
  STORED AS SCD TYPE 1 ;

 

 

table1 is in schema1 and is silver layer. table2 is in schema2 and is bronze layer.

  1. table2 may have different, new columns than table1. how do I insert or update these new columns into table1. I am assuming that would give us schema mismatch errors.
  2. I want to insert some new columns based on the whether the operation is an insert operation or an update operation. If it is an update operation, I want to insert a column called 'LastModifiedDate' with the current timestamp and if it is an insert operation I want to insert a column called 'CreatedDate' as current timestamp.

How can I do these 2 things?

 

Thanks in advance.

raphaelblg
Databricks Employee
Databricks Employee

Hi @ameya ,

Scenario 1: Enabling Delta schema evolution in your table or at DLT pipeline level should suffice for the scenario of new fields being added to the schema. 

Scenario 2: The INSERT statement doesn't support schema evolution as described in Delta schema evolution. You can explore the MERGE INTO statement for INSERT/UPDATE scenarios, merges will support schema evolution but I believe there is no way to set different schemas based on the underlying operation.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

View solution in original post