Sunday
I use databricks 17.3 runtime.
I try to run following code.
CREATE OR REPLACE TABLE default.target_table (
key1 INT,
key2 INT,
key3 INT,
val STRING
) USING DELTA;
INSERT INTO target_table(key1, key2, key3, val) VALUES
(1, 1, 1, 'a');
CREATE OR REPLACE TABLE default.source_table (
key1 INT,
key2 INT,
key3 INT,
val STRING,
rn1 INT,
rn2 INT,
rn3 INT
) USING DELTA;
INSERT INTO source_table(key1, key2, key3, val, rn1, rn2, rn3) VALUES
(1, 1, 1, 'a1', 1, 0, 0),
(1, 1, 2, 'a2', 0, 1, 0),
(1, 3, 3, 'a3', 0, 0, 1);
MERGE INTO target_table AS target
USING source_table AS source
ON target.key1 = source.key1
WHEN MATCHED AND target.key2 = source.key2 AND target.key3 = source.key3 AND source.rn1 = 1 THEN
UPDATE SET val = source.val
WHEN MATCHED AND target.key2 = source.key2 AND source.rn2 = 1 THEN
UPDATE SET val = source.val
WHEN MATCHED AND source.rn3 = 1 THEN
UPDATE SET val = source.val
;
and got following error:
[DELTA_MULTIPLE_SOURCE_ROW_MATCHING_TARGET_ROW_IN_MERGE] Cannot perform Merge as multiple source rows matched and attempted to modify the same target row in the Delta table in possibly conflicting ways. By SQL semantics of Merge, when multiple source rows match on the same target row, the result may be ambiguous as it is unclear which source row should be used to update or delete the matching target row. You can preprocess the source table to eliminate the possibility of multiple matches. Please refer to https://docs.microsoft.com/azure/databricks/delta/merge#merge-error SQLSTATE: 21506
which is a strange, because for each of “MATCHED AND” clause there is exactly only one row matching.
Above code I created to demonstrate the real-life problem I spotted.
yesterday
Option 1: Strengthen the ON clause to uniquely identify the target row
If you intend to update only when the full key matches, make the ON clause include all keys:
MERGE INTO target_table AS target
USING source_table AS source
ON target.key1 = source.key1
AND target.key2 = source.key2
AND target.key3 = source.key3
WHEN MATCHED AND source.rn1 = 1 THEN
UPDATE SET val = source.val;
This guarantees at most one source row can match a given target row (assuming your keys are unique), preventing the ambiguity error.
Option 2: Preselect exactly one source row per target row (fallback priority)
If you want “priority” matching (exact triplet if present; else same key2; else any rn3), compute the single best source row per target.key1 first, then MERGE with an ON that still only joins by key1.
WITH candidates AS (
SELECT
t.key1,
t.key2,
t.key3,
s.val,
CASE
WHEN t.key2 = s.key2 AND t.key3 = s.key3 AND s.rn1 = 1 THEN 3
WHEN t.key2 = s.key2 AND s.rn2 = 1 THEN 2
WHEN s.rn3 = 1 THEN 1
ELSE 0
END AS priority
FROM target_table t
JOIN source_table s
ON t.key1 = s.key1
),
picked AS (
SELECT key1, val
FROM (
SELECT
key1, val, priority,
ROW_NUMBER() OVER (PARTITION BY key1 ORDER BY priority DESC) AS rnk
FROM candidates
WHERE priority > 0
) q
WHERE rnk = 1
)
MERGE INTO target_table AS target
USING picked AS source
ON target.key1 = source.key1
WHEN MATCHED THEN
UPDATE SET target.val = source.val;
This ensures there’s exactly one source row per target.key1 before MERGE, so MERGE can’t see multiple matches and won’t error.
yesterday
This behaviour is expected with Delta MERGE in your runtime and is due to how duplicate matches are detected.
The error DELTA_MULTIPLE_SOURCE_ROW_MATCHING_TARGET_ROW_IN_MERGE occurs because Delta requires exactly one source row to match a target row during a MERGE. In DBR 16.0+ (including 17.3), Delta checks for duplicates using both the ON clause and the WHEN MATCHED conditions. So even if each WHEN clause seems to apply to a different row, if multiple source rows still match the same target row overall, the MERGE is considered ambiguous and fails.
In your case, the ON condition is only key1, and all source rows share key1 = 1. This causes all of them to match the same target row, triggering the error under the stricter semantics in newer runtimes.
To fix this safely, you have two options. Either strengthen the ON clause so it uniquely identifies the target row (for example, include all key columns), or preprocess the source data to ensure only one source row exists per target row before running MERGE (for example, by ranking and picking a single “best” row).
The key takeaway is that MERGE must be unambiguous. In newer DBRs, separating logic into multiple WHEN clauses is no longer enough—your source data or join condition must guarantee a single match per target row.
yesterday
Option 1: Strengthen the ON clause to uniquely identify the target row
If you intend to update only when the full key matches, make the ON clause include all keys:
MERGE INTO target_table AS target
USING source_table AS source
ON target.key1 = source.key1
AND target.key2 = source.key2
AND target.key3 = source.key3
WHEN MATCHED AND source.rn1 = 1 THEN
UPDATE SET val = source.val;
This guarantees at most one source row can match a given target row (assuming your keys are unique), preventing the ambiguity error.
Option 2: Preselect exactly one source row per target row (fallback priority)
If you want “priority” matching (exact triplet if present; else same key2; else any rn3), compute the single best source row per target.key1 first, then MERGE with an ON that still only joins by key1.
WITH candidates AS (
SELECT
t.key1,
t.key2,
t.key3,
s.val,
CASE
WHEN t.key2 = s.key2 AND t.key3 = s.key3 AND s.rn1 = 1 THEN 3
WHEN t.key2 = s.key2 AND s.rn2 = 1 THEN 2
WHEN s.rn3 = 1 THEN 1
ELSE 0
END AS priority
FROM target_table t
JOIN source_table s
ON t.key1 = s.key1
),
picked AS (
SELECT key1, val
FROM (
SELECT
key1, val, priority,
ROW_NUMBER() OVER (PARTITION BY key1 ORDER BY priority DESC) AS rnk
FROM candidates
WHERE priority > 0
) q
WHERE rnk = 1
)
MERGE INTO target_table AS target
USING picked AS source
ON target.key1 = source.key1
WHEN MATCHED THEN
UPDATE SET target.val = source.val;
This ensures there’s exactly one source row per target.key1 before MERGE, so MERGE can’t see multiple matches and won’t error.
yesterday
Thank you @iyashk-DB for explanation and solution to my my problem.
Option 2 is what I need.
yesterday
Hi, you need to put all of the keys in the ON part of the clause rather then in the where condition. This code works:
MERGE INTO target_table AS target
USING source_table AS source
ON target.key1 = source.key1
AND target.key2 = source.key2
AND target.key3 = source.key3
WHEN MATCHED AND source.rn1 = 1 THEN
UPDATE SET val = source.val
WHEN MATCHED AND source.rn2 = 1 THEN
UPDATE SET val = source.val
WHEN MATCHED AND source.rn3 = 1 THEN
UPDATE SET val = source.val
Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!
Sign Up Now