<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: whenNotMatchedBySourceUpdate ConcurrentAppendException Partition in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/whennotmatchedbysourceupdate-concurrentappendexception-partition/m-p/44294#M27627</link>
    <description>&lt;P&gt;&lt;A href="https://docs.databricks.com/en/delta/merge.html" target="_blank"&gt;https://docs.databricks.com/en/delta/merge.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;"By definition, whenNotMatchedBySource clauses do not have a source row to pull column values from, and so source columns can’t be referenced."&lt;/P&gt;&lt;P&gt;I found out that this was not functioning correctly in runtime 12.2 but works in 13.3.&lt;BR /&gt;So it must have been a fix in the Delta Lake version&lt;/P&gt;</description>
    <pubDate>Mon, 11 Sep 2023 08:09:46 GMT</pubDate>
    <dc:creator>marcuskw</dc:creator>
    <dc:date>2023-09-11T08:09:46Z</dc:date>
    <item>
      <title>whenNotMatchedBySourceUpdate ConcurrentAppendException Partition</title>
      <link>https://community.databricks.com/t5/data-engineering/whennotmatchedbysourceupdate-concurrentappendexception-partition/m-p/44097#M27602</link>
      <description>&lt;P&gt;ConcurrentAppendException requires a good partitioning strategy, here my logic works without fault for&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;SPAN&gt;"whenMatchedUpdate" and "&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN&gt;whenNotMatchedInsert" logic.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;SPAN&gt;When using "&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN&gt;whenNotMatchedBySourceUpdate" however it seems that the condition doesn't isolate the specific partition in the delta table.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;So when merging a dataframe with that logic we meet a&amp;nbsp;ConcurrentAppendException when running in parallel even though the table is set up with "partition" as a constraint.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; deltaTable.alias&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'t'&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .merge&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;df.alias&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'c'&lt;/SPAN&gt;&lt;SPAN&gt;),&lt;/SPAN&gt; &lt;SPAN&gt;f" t.partition= '&lt;/SPAN&gt;&lt;SPAN&gt;{partition&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;' AND t.id= c.id"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .whenMatchedUpdate&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt; &lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt; =&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"t.id"&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt; &lt;SPAN&gt;"c.id"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;"t.name"&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt; &lt;SPAN&gt;"c.name"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;"t.partition"&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt; &lt;SPAN&gt;"c.partition"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;"t.flag"&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt; &lt;SPAN&gt;"c.flag"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .whenNotMatchedInsert&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt; values =&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"t.id"&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt; &lt;SPAN&gt;"c.id"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;"t.name"&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt; &lt;SPAN&gt;"c.name"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ,"t.partition": "c.partition"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ,"t.flag": "c.flag"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .whenNotMatchedBySourceUpdate&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt; condition = &lt;/SPAN&gt;&lt;SPAN&gt;f"t.partition= '&lt;/SPAN&gt;&lt;SPAN&gt;{partition&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;'"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt; &amp;nbsp;=&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;SPAN&gt;"t.flag"&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;SPAN&gt; F.lit&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;True&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .execute&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;)&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Have I misunderstood the merge syntax, possibly whenNotMatchedBySourceUpdate scans the whole table and ignores the condition?&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 08 Sep 2023 14:14:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/whennotmatchedbysourceupdate-concurrentappendexception-partition/m-p/44097#M27602</guid>
      <dc:creator>marcuskw</dc:creator>
      <dc:date>2023-09-08T14:14:10Z</dc:date>
    </item>
    <item>
      <title>Re: whenNotMatchedBySourceUpdate ConcurrentAppendException Partition</title>
      <link>https://community.databricks.com/t5/data-engineering/whennotmatchedbysourceupdate-concurrentappendexception-partition/m-p/44294#M27627</link>
      <description>&lt;P&gt;&lt;A href="https://docs.databricks.com/en/delta/merge.html" target="_blank"&gt;https://docs.databricks.com/en/delta/merge.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;"By definition, whenNotMatchedBySource clauses do not have a source row to pull column values from, and so source columns can’t be referenced."&lt;/P&gt;&lt;P&gt;I found out that this was not functioning correctly in runtime 12.2 but works in 13.3.&lt;BR /&gt;So it must have been a fix in the Delta Lake version&lt;/P&gt;</description>
      <pubDate>Mon, 11 Sep 2023 08:09:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/whennotmatchedbysourceupdate-concurrentappendexception-partition/m-p/44294#M27627</guid>
      <dc:creator>marcuskw</dc:creator>
      <dc:date>2023-09-11T08:09:46Z</dc:date>
    </item>
  </channel>
</rss>

