<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: DLT Merge tables into Delta in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/dlt-merge-tables-into-delta/m-p/102479#M41133</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/106294"&gt;@Alberto_Umana&lt;/a&gt;&amp;nbsp;Thank you for the quick reply. But how should we use the above? It looks like Structured Streaming with CDF mode.&lt;/P&gt;&lt;P&gt;Our tables are in Unity Catalog, and because they hold near-real-time data, finding the start and end versions takes a very long time. So we wanted to read the source using DLT instead of CDF mode. When reading the source with DLT, how should we use the foreachBatch and MERGE approach above? Could you please guide us?&lt;/P&gt;</description>
    <pubDate>Wed, 18 Dec 2024 13:12:26 GMT</pubDate>
    <dc:creator>JothyGanesan</dc:creator>
    <dc:date>2024-12-18T13:12:26Z</dc:date>
    <item>
      <title>DLT Merge tables into Delta</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-merge-tables-into-delta/m-p/102476#M41131</link>
      <description>&lt;P&gt;We are trying to load a Delta table from streaming tables using DLT. The target table needs a MERGE of three source tables, but when we use MERGE in DLT, it says MERGE is not supported. Is this related to the DLT version? Please help us with this.&lt;/P&gt;</description>
      <pubDate>Wed, 18 Dec 2024 12:57:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-merge-tables-into-delta/m-p/102476#M41131</guid>
      <dc:creator>JothyGanesan</dc:creator>
      <dc:date>2024-12-18T12:57:43Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Merge tables into Delta</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-merge-tables-into-delta/m-p/102478#M41132</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/134682"&gt;@JothyGanesan&lt;/a&gt;,&lt;/P&gt;
&lt;P class="p1"&gt;Delta Live Tables (DLT) currently does not support the MERGE operation directly within a DLT pipeline. This limitation is not related to the DLT version; it is a general restriction of DLT.&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;However, you can achieve the desired outcome by combining foreachBatch and MERGE in a Structured Streaming query. Here is an example of how to use foreachBatch to perform a MERGE operation:&lt;/P&gt;
&lt;PRE&gt;from delta.tables import DeltaTable

def upsert_to_delta(microBatchOutputDF, batchId):
    # Upsert one micro-batch into the target table, matching rows on "key"
    deltaTable = DeltaTable.forName(spark, "target_table_name")
    deltaTable.alias("t").merge(
        microBatchOutputDF.alias("s"),
        "s.key = t.key"
    ).whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()

# streamingDF is the streaming DataFrame read from the source
streamingDF.writeStream.foreachBatch(upsert_to_delta).outputMode("update").start()&lt;/PRE&gt;
&lt;P class="p1"&gt;In this example, upsert_to_delta is a function that performs the MERGE operation using the Delta Lake APIs. The foreachBatch method applies this function to each micro-batch of the streaming DataFrame.&lt;/P&gt;</description>
      <pubDate>Wed, 18 Dec 2024 13:03:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-merge-tables-into-delta/m-p/102478#M41132</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2024-12-18T13:03:45Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Merge tables into Delta</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-merge-tables-into-delta/m-p/102479#M41133</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/106294"&gt;@Alberto_Umana&lt;/a&gt;&amp;nbsp;Thank you for the quick reply. But how should we use the above? It looks like Structured Streaming with CDF mode.&lt;/P&gt;&lt;P&gt;Our tables are in Unity Catalog, and because they hold near-real-time data, finding the start and end versions takes a very long time. So we wanted to read the source using DLT instead of CDF mode. When reading the source with DLT, how should we use the foreachBatch and MERGE approach above? Could you please guide us?&lt;/P&gt;</description>
      <pubDate>Wed, 18 Dec 2024 13:12:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-merge-tables-into-delta/m-p/102479#M41133</guid>
      <dc:creator>JothyGanesan</dc:creator>
      <dc:date>2024-12-18T13:12:26Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Merge tables into Delta</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-merge-tables-into-delta/m-p/102993#M41292</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/134682"&gt;@JothyGanesan&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;Please take a look at the APPLY CHANGES API:&amp;nbsp;&lt;A href="https://docs.databricks.com/en/delta-live-tables/cdc.html" target="_blank"&gt;https://docs.databricks.com/en/delta-live-tables/cdc.html&lt;/A&gt;&lt;BR /&gt;In DLT pipelines, it is used in place of MERGE INTO.&lt;/P&gt;&lt;P&gt;Cheers!&lt;/P&gt;</description>
      <pubDate>Mon, 23 Dec 2024 10:36:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-merge-tables-into-delta/m-p/102993#M41292</guid>
      <dc:creator>RiyazAliM</dc:creator>
      <dc:date>2024-12-23T10:36:46Z</dc:date>
    </item>
  </channel>
</rss>

