<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to handle exploded records with overwrite-by-key logic in Delta Live Tables in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-handle-exploded-records-with-overwrite-by-key-logic-in/m-p/123117#M46947</link>
    <description>&lt;P&gt;Hi&amp;nbsp;Dewlap,&lt;/P&gt;&lt;P&gt;How are you doing today? As per my understanding,&amp;nbsp;You're right to notice that apply_changes in DLT works best for one-row-per-key updates and doesn't fit well when you need to replace multiple rows for the same key, especially after exploding arrays. A good workaround is to break this into two steps: first, write your transformed and exploded records into a staging table or view; then, in the next step, delete all existing records from the target table that match the incoming keys and insert the new rows. This "overwrite-by-key" pattern gives you better control and is more suited for your use case. While DLT doesn’t have a native function for this, handling it in a custom logic with a staging layer (especially in a silver-to-gold setup) is a practical and often-used solution. If you're running in streaming mode, you might also consider using a foreachBatch block to manage the deletes and inserts safely. Let me know if you'd like help with a code example.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Brahma&lt;/P&gt;</description>
    <pubDate>Fri, 27 Jun 2025 22:07:23 GMT</pubDate>
    <dc:creator>Brahmareddy</dc:creator>
    <dc:date>2025-06-27T22:07:23Z</dc:date>
    <item>
      <title>How to handle exploded records with overwrite-by-key logic in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-handle-exploded-records-with-overwrite-by-key-logic-in/m-p/123116#M46946</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using Delta Live Tables (DLT) with the apply_changes API to manage SCD Type 1 on a source table. However, I’ve run into a limitation.&lt;/P&gt;&lt;H3&gt;Context:&lt;/H3&gt;&lt;P&gt;After apply_changes, I have a derived view that:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;Flattens and &lt;STRONG&gt;explodes a JSON array field&lt;/STRONG&gt; in the source data.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Produces &lt;STRONG&gt;multiple rows per original primary key&lt;/STRONG&gt; (key_id).&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Needs to be written to a &lt;STRONG&gt;destination Delta table&lt;/STRONG&gt;, such that:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;If records already exist for a key_id, they should be &lt;STRONG&gt;fully replaced&lt;/STRONG&gt; (delete existing rows and insert the new exploded ones).&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;Challenge:&lt;/H3&gt;&lt;P&gt;apply_changes processes &lt;STRONG&gt;one record at a time&lt;/STRONG&gt;, which doesn’t suit my case, where &lt;STRONG&gt;multiple rows need to be updated for one incoming key&lt;/STRONG&gt;. This makes it hard to cleanly replace existing groups of records in the target.&lt;/P&gt;&lt;H3&gt;Question:&lt;/H3&gt;&lt;P&gt;&lt;STRONG&gt;Is there a native way (or recommended pattern) in Databricks/DLT to support this kind of "overwrite-by-key" logic&lt;/STRONG&gt;, where we delete all rows by key and insert multiple transformed rows, preferably in a streaming or near-real-time pipeline?&lt;/P&gt;</description>
      <pubDate>Fri, 27 Jun 2025 21:39:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-handle-exploded-records-with-overwrite-by-key-logic-in/m-p/123116#M46946</guid>
      <dc:creator>Dewlap</dc:creator>
      <dc:date>2025-06-27T21:39:30Z</dc:date>
    </item>
    <item>
      <title>Re: How to handle exploded records with overwrite-by-key logic in Delta Live Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-handle-exploded-records-with-overwrite-by-key-logic-in/m-p/123117#M46947</link>
      <description>&lt;P&gt;Hi&amp;nbsp;Dewlap,&lt;/P&gt;&lt;P&gt;How are you doing today? As per my understanding,&amp;nbsp;You're right to notice that apply_changes in DLT works best for one-row-per-key updates and doesn't fit well when you need to replace multiple rows for the same key, especially after exploding arrays. A good workaround is to break this into two steps: first, write your transformed and exploded records into a staging table or view; then, in the next step, delete all existing records from the target table that match the incoming keys and insert the new rows. This "overwrite-by-key" pattern gives you better control and is more suited for your use case. While DLT doesn’t have a native function for this, handling it in a custom logic with a staging layer (especially in a silver-to-gold setup) is a practical and often-used solution. If you're running in streaming mode, you might also consider using a foreachBatch block to manage the deletes and inserts safely. Let me know if you'd like help with a code example.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Brahma&lt;/P&gt;</description>
      <pubDate>Fri, 27 Jun 2025 22:07:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-handle-exploded-records-with-overwrite-by-key-logic-in/m-p/123117#M46947</guid>
      <dc:creator>Brahmareddy</dc:creator>
      <dc:date>2025-06-27T22:07:23Z</dc:date>
    </item>
  </channel>
</rss>

