Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

UnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same

Prashant777
New Contributor II

My code:

CREATE OR REPLACE TEMPORARY VIEW preprocessed_source AS
SELECT
  Key_ID,
  Distributor_ID,
  Customer_ID,
  Customer_Name,
  Channel
FROM integr_masterdata.Customer_Master;

-- Step 2: Perform the merge operation using the preprocessed source table
MERGE INTO slvr_masterdata.Customer_Master AS Target
USING preprocessed_source AS Source
ON Source.Key_ID = Target.Key_ID
WHEN MATCHED THEN
  UPDATE SET
    Target.Distributor_ID = Source.Distributor_ID,
    Target.Customer_ID = Source.Customer_ID,
    Target.Customer_Name = Source.Customer_Name,
    Target.Channel = Source.Channel,
    Target.Time_Stamp = current_timestamp()
WHEN NOT MATCHED THEN
  INSERT (
    Distributor_ID,
    Customer_ID,
    Customer_Name,
    Channel,
    Time_Stamp
  )
  VALUES (
    Source.Distributor_ID,
    Source.Customer_ID,
    Source.Customer_Name,
    Source.Channel,
    current_timestamp()
  )

3 REPLIES

daniel_sahal
Esteemed Contributor

@Prashant Joshi

Since you're merging on Source.Key_ID = Target.Key_ID, you need to ensure that Key_ID is unique in your source. Otherwise, when several source rows match the same target row, the merge throws this error because it cannot determine which source row should update it.
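To make that concrete, here is a minimal sketch of how you might check for duplicate keys and then keep one row per Key_ID before merging. It reuses the table and view names from the question. The ORDER BY inside ROW_NUMBER() is only a placeholder assumption: the thread does not say which duplicate should win, so replace it with whatever column identifies the row you want to keep (for example, a last-modified timestamp if your source has one).

```sql
-- 1. List the Key_ID values that occur more than once in the source.
SELECT Key_ID, COUNT(*) AS row_count
FROM integr_masterdata.Customer_Master
GROUP BY Key_ID
HAVING COUNT(*) > 1;

-- 2. Rebuild the source view so that it holds exactly one row per Key_ID.
CREATE OR REPLACE TEMPORARY VIEW preprocessed_source AS
SELECT
  Key_ID,
  Distributor_ID,
  Customer_ID,
  Customer_Name,
  Channel
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY Key_ID
      -- Assumption: placeholder tiebreak; pick the ordering that matches
      -- your own rule for which duplicate should be kept.
      ORDER BY Customer_ID, Distributor_ID
    ) AS rn
  FROM integr_masterdata.Customer_Master
) ranked
WHERE rn = 1;
```

If the first query returns no rows, the source is already unique on Key_ID. Note that PARTITION BY groups NULL keys together, so the second query would also keep only one row with a NULL Key_ID; handle those rows separately if that is not what you want.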

Anonymous
Not applicable

Hi @Prashant Joshi

Hope everything is going great.

Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you. 

Cheers!

LokeshManne
New Contributor III

@Prashant777 

In your scenario, the UPDATE clause also updates the primary key columns. When you re-run the same batch or file, the Delta table cannot differentiate those rows, treats them all as duplicates, and throws the error. To run without the error, remove Target.Distributor_ID = Source.Distributor_ID and Target.Customer_ID = Source.Customer_ID from the UPDATE clause. Updated code below.

MERGE INTO slvr_masterdata.Customer_Master AS Target
USING preprocessed_source AS Source
ON Source.Key_ID = Target.Key_ID
WHEN MATCHED THEN
  UPDATE SET
    Target.Customer_Name = Source.Customer_Name,
    Target.Channel = Source.Channel,
    Target.Time_Stamp = current_timestamp()
WHEN NOT MATCHED THEN
  INSERT (
    Distributor_ID,
    Customer_ID,
    Customer_Name,
    Channel,
    Time_Stamp
  )
  VALUES (
    Source.Distributor_ID,
    Source.Customer_ID,
    Source.Customer_Name,
    Source.Channel,
    current_timestamp()
  )

Lokesh Manne