iyashk-DB
Databricks Employee
Databricks Employee

Option 1: Strengthen the ON clause to uniquely identify the target row
If you intend to update only when the full key matches, make the ON clause include all keys:

MERGE INTO target_table AS target
USING source_table AS source
ON target.key1 = source.key1
AND target.key2 = source.key2
AND target.key3 = source.key3
WHEN MATCHED AND source.rn1 = 1 THEN
UPDATE SET val = source.val;

This guarantees at most one source row can match a given target row (assuming your keys are unique), preventing the ambiguity error.

Option 2: Preselect exactly one source row per target row (fallback priority)
If you want “priority” matching (exact triplet if present; else same key2; else any rn3), compute the single best source row per target.key1 first, then MERGE with an ON that still only joins by key1.

WITH candidates AS (
SELECT
t.key1,
t.key2,
t.key3,
s.val,
CASE
WHEN t.key2 = s.key2 AND t.key3 = s.key3 AND s.rn1 = 1 THEN 3
WHEN t.key2 = s.key2 AND s.rn2 = 1 THEN 2
WHEN s.rn3 = 1 THEN 1
ELSE 0
END AS priority
FROM target_table t
JOIN source_table s
ON t.key1 = s.key1
),
picked AS (
SELECT key1, val
FROM (
SELECT
key1, val, priority,
ROW_NUMBER() OVER (PARTITION BY key1 ORDER BY priority DESC) AS rnk
FROM candidates
WHERE priority > 0
) q
WHERE rnk = 1
)
MERGE INTO target_table AS target
USING picked AS source
ON target.key1 = source.key1
WHEN MATCHED THEN
UPDATE SET target.val = source.val;

This ensures there’s exactly one source row per target.key1 before MERGE, so MERGE can’t see multiple matches and won’t error.

View solution in original post