Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to merge Delta data from multiple folders

Krishscientist
New Contributor III

I converted data from Parquet to Delta, and the Delta files were written into different folders based on SRC_SYS_ID.

Can anyone help me merge the Delta data from these multiple folders?

Regards.

2 REPLIES

Aashita
Databricks Employee

@Krishna Kommineni,

The MERGE command lets you perform "upserts", which combine an UPDATE and an INSERT in a single operation.

To understand upserts, imagine that you have an existing table (a.k.a. a target table), and a source table that contains a mix of new records and updates to existing records. Here’s how an upsert works:

  • When a record from the source table matches a preexisting record in the target table, Delta Lake updates the record.
  • When there is no such match, Delta Lake inserts the new record.

Example code:

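-- events is the target table; updates is the source table
MERGE INTO events
USING updates
    ON events.eventId = updates.eventId
    -- matching eventId: update the existing row
    WHEN MATCHED THEN UPDATE
        SET events.data = updates.data
    -- no matching eventId: insert the source row
    WHEN NOT MATCHED THEN
        INSERT (date, eventId, data)
        VALUES (updates.date, updates.eventId, updates.data)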
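
If it helps to see the matched / not-matched logic row by row, here is a rough plain-Python sketch of what the MERGE above does. It is only an illustration of the semantics, not how Delta Lake implements it: the `upsert` helper and the sample rows are made up for this example, and the tables are modelled as simple in-memory dicts and lists.

```python
from typing import Dict, List

Row = Dict[str, object]


def upsert(target: Dict[int, Row], source: List[Row], key: str = "eventId") -> Dict[int, Row]:
    """Hypothetical helper that mimics the MERGE: update on match, insert otherwise."""
    # Copy the target so the caller's table is left untouched.
    merged = {k: dict(v) for k, v in target.items()}
    for row in source:
        k = row[key]
        if k in merged:
            # WHEN MATCHED: only the data column is updated, as in the SQL.
            merged[k]["data"] = row["data"]
        else:
            # WHEN NOT MATCHED: the whole source row is inserted.
            merged[k] = dict(row)
    return merged


# Made-up sample rows: one existing event, one update to it, one new event.
events = {1: {"eventId": 1, "date": "2022-01-01", "data": "old"}}
updates = [
    {"eventId": 1, "date": "2022-01-02", "data": "new"},
    {"eventId": 2, "date": "2022-01-02", "data": "first"},
]

result = upsert(events, updates)
for event_id in sorted(result):
    print(result[event_id])
```

Note that the matched row keeps its original date, because the UPDATE clause sets only the data column.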

Noopur_Nigam
Databricks Employee

Hi @Krishna Kommineni, is the table partitioned on the SRC_SYS_ID column?
