Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Error in SQL statement: UnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same target row in the Delta table in possibly conflicting ways

Prashant777
New Contributor II

My code:

-- Step 1: Create the preprocessed source view
CREATE OR REPLACE TEMPORARY VIEW preprocessed_source AS
SELECT
  Key_ID,
  Distributor_ID,
  Customer_ID,
  Customer_Name,
  Channel
FROM integr_masterdata.Customer_Master;

-- Step 2: Perform the merge operation using the preprocessed source table
MERGE INTO slvr_masterdata.Customer_Master AS Target
USING preprocessed_source AS Source
ON Source.Key_ID = Target.Key_ID
WHEN MATCHED THEN
  UPDATE SET
    Target.Distributor_ID = Source.Distributor_ID,
    Target.Customer_ID = Source.Customer_ID,
    Target.Customer_Name = Source.Customer_Name,
    Target.Channel = Source.Channel,
    Target.Time_Stamp = current_timestamp()
WHEN NOT MATCHED THEN
  INSERT (
    Distributor_ID,
    Customer_ID,
    Customer_Name,
    Channel,
    Time_Stamp
  )
  VALUES (
    Source.Distributor_ID,
    Source.Customer_ID,
    Source.Customer_Name,
    Source.Channel,
    current_timestamp()
  );

6 REPLIES

-werners-
Esteemed Contributor III

You have duplicates in your incoming data according to the join condition (Key_ID in this case).

The way to handle this is to get rid of the duplicates before you do the merge.
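For example, deduplicating on Key_ID when building the view could look like the sketch below. This is not the original poster's code; the ORDER BY column is only an assumption about which of the duplicate rows should win, and should be replaced with whatever column marks the most recent or preferred record.

-- Sketch: keep exactly one row per Key_ID before merging
CREATE OR REPLACE TEMPORARY VIEW preprocessed_source AS
SELECT Key_ID, Distributor_ID, Customer_ID, Customer_Name, Channel
FROM (
  SELECT
    *,
    -- ORDER BY Customer_ID is a placeholder tie-breaker; pick the column
    -- that identifies the row you actually want to keep
    ROW_NUMBER() OVER (PARTITION BY Key_ID ORDER BY Customer_ID) AS rn
  FROM integr_masterdata.Customer_Master
) t
WHERE rn = 1;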

Thank you werners for your answer. Do I have to remove the duplicates in the same code? Can you provide me the code?

-werners-
Esteemed Contributor III

First you have to find out what the cause of the duplicates is.

It might be that you are trying to join on an incomplete key; in that case you have to change your join condition.

Or perhaps you can just do a dropDuplicates/distinct.

I never use SQL to prepare data, by the way; in my opinion you lose a lot of flexibility.
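A quick way to check whether Key_ID is really unique in the source is a simple aggregation (a generic sketch against the table from the question):

-- Sketch: list any Key_ID values that occur more than once in the source
SELECT Key_ID, COUNT(*) AS row_cnt
FROM integr_masterdata.Customer_Master
GROUP BY Key_ID
HAVING COUNT(*) > 1;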

Tread
New Contributor II

Hey, as previously stated, you could drop the duplicates in the columns that contain them (code you can find online pretty easily). I have had this problem myself, and it came up when creating a temporary view from a DataFrame: the DataFrame did not include duplicates, but the temp view did. Hope this helps.
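If the offending rows are exact duplicates, defining the view with DISTINCT is enough (a sketch based on the view from the question; it will not help if rows share a Key_ID but differ in other columns, in which case the ROW_NUMBER approach above is needed):

-- Sketch: remove fully identical rows when creating the view
CREATE OR REPLACE TEMPORARY VIEW preprocessed_source AS
SELECT DISTINCT
  Key_ID,
  Distributor_ID,
  Customer_ID,
  Customer_Name,
  Channel
FROM integr_masterdata.Customer_Master;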

LokeshManne
New Contributor III

The error occurs when we try to update all the cells of target_data without a single updated record in source_data (updates_data). To resolve this issue, add an update_time column with the current timestamp, or make a change in at least one cell of the streaming/batch/incremental data, so that the Delta table knows it is not a duplicate.

Lokesh Manne

LokeshManne
New Contributor III

This error occurs when we try to update all the cells of target_data without a single updated record in source_data (updates_data). To resolve it, add an update_time column with a unix timestamp, or change at least one cell of the streaming/batch/incremental data, so that the Delta table knows it is not a duplicate.

In your scenario, when you re-run the notebook with the current timestamp, it differs only at the hour and day level rather than in seconds and minutes, which makes the whole dataset look like a duplicate, since you re-ran it within an hour (less than 60 minutes).

Lokesh Manne