How to merge Delta data

Krishscientist
New Contributor III

The data was converted from Parquet to Delta, and the Delta files were written into different folders based on SRC_SYS_ID.

Can anyone help me merge the Delta data from these multiple folders?

Regards.

Aashita
Databricks Employee

@Krishna Kommineni​ ,

The MERGE command allows you to perform “upserts”, which are a mix of an UPDATE and an INSERT.

To understand upserts, imagine that you have an existing table (a.k.a. a target table), and a source table that contains a mix of new records and updates to existing records. Here’s how an upsert works:

  • When a record from the source table matches a preexisting record in the target table, Delta Lake updates the record.
  • When there is no such match, Delta Lake inserts the new record.

Example code:

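-- events is the target table, updates is the source table
MERGE INTO events
USING updates
    ON events.eventId = updates.eventId
    -- matching eventId: overwrite the data column of the existing row
    WHEN MATCHED THEN UPDATE
        SET events.data = updates.data
    -- no matching eventId: insert the source row as a new record
    WHEN NOT MATCHED THEN
        INSERT (date, eventId, data)
        VALUES (updates.date, updates.eventId, updates.data)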
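
To make the matched / not-matched logic concrete, here is a minimal plain-Python sketch of the same upsert semantics. It does not use Delta Lake or Spark. The function name merge_upsert and the sample rows are made up for illustration, and it assumes each eventId appears at most once in the source.

```python
def merge_upsert(target, source, key="eventId"):
    """Return a new list of rows with `source` upserted into `target` on `key`.

    Illustration only: assumes `key` is unique within `source`.
    """
    # Index copies of the target rows by key so the inputs are not mutated.
    merged = {row[key]: dict(row) for row in target}
    for row in source:
        if row[key] in merged:
            # WHEN MATCHED: update only the `data` column of the existing row.
            merged[row[key]]["data"] = row["data"]
        else:
            # WHEN NOT MATCHED: insert the source row as a new record.
            merged[row[key]] = dict(row)
    return list(merged.values())


# Hypothetical sample data standing in for the `events` and `updates` tables.
events = [
    {"date": "2024-01-01", "eventId": 1, "data": "old"},
    {"date": "2024-01-01", "eventId": 2, "data": "keep"},
]
updates = [
    {"date": "2024-01-02", "eventId": 1, "data": "new"},    # existing eventId
    {"date": "2024-01-02", "eventId": 3, "data": "added"},  # new eventId
]

result = merge_upsert(events, updates)
for row in result:
    print(row)
```

As in the MERGE statement, the row with eventId 1 keeps its original date and only its data changes, eventId 2 is left untouched, and eventId 3 is inserted as a new row.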

Noopur_Nigam
Databricks Employee

Hi @Krishna Kommineni​, is the table partitioned on the SRC_SYS_ID column?