cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Adding new columns to a Delta Live table in a CDC process

ameya
New Contributor

Hi 

I am new to databricks and still learning.

I am trying to do a CDC on a table.

 

 

APPLY CHANGES INTO LIVE.table1
FROM schema2.table2
KEYS (Id)
SEQUENCE BY orderByColumn
COLUMNS * EXCEPT (col1, col2)
  STORED AS SCD TYPE 1 ;

 

 

table1 is in schema1 and is silver layer. table2 is in schema2 and is bronze layer.

  1. table2 may have different, new columns than table1. how do I insert or update these new columns into table1. I am assuming that would give us schema mismatch errors.
  2. I want to insert some new columns based on the whether the operation is an insert operation or an update operation. If it is an update operation, I want to insert a column called 'LastModifiedDate' with the current timestamp and if it is an insert operation I want to insert a column called 'CreatedDate' as current timestamp.

How can I do these 2 things?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions

raphaelblg
Databricks Employee
Databricks Employee

Hi @ameya ,

Scenario 1: Enabling Delta schema evolution in your table or at DLT pipeline level should suffice for the scenario of new fields being added to the schema. 

Scenario 2: The INSERT statement doesn't support schema evolution as described in Delta schema evolution. You can explore the MERGE INTO statement for INSERT/UPDATE scenarios, merges will support schema evolution but I believe there is no way to set different schemas based on the underlying operation.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

View solution in original post

1 REPLY 1

raphaelblg
Databricks Employee
Databricks Employee

Hi @ameya ,

Scenario 1: Enabling Delta schema evolution in your table or at DLT pipeline level should suffice for the scenario of new fields being added to the schema. 

Scenario 2: The INSERT statement doesn't support schema evolution as described in Delta schema evolution. You can explore the MERGE INTO statement for INSERT/UPDATE scenarios, merges will support schema evolution but I believe there is no way to set different schemas based on the underlying operation.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group