<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delete from delta table in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/delete-from-delta-table/m-p/29017#M20774</link>
    <description>&lt;P&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Tue, 15 Feb 2022 11:37:16 GMT</pubDate>
    <dc:creator>BorislavBlagoev</dc:creator>
    <dc:date>2022-02-15T11:37:16Z</dc:date>
    <item>
      <title>Delete from delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/delete-from-delta-table/m-p/29015#M20772</link>
      <description>&lt;P&gt;What is the best way to delete from the delta table? In my case, I want to read a table from the MySQL database (without a soft delete column) and then store that table in Azure as a Delta table. When the ids are equal I will update the Delta table when they are not I will insert records into the Delta table, but how to perform the delete. The logic for the deleting is when the Delta table contains id but this id is not in the new data (MySQL table)?&lt;/P&gt;</description>
      <pubDate>Wed, 09 Feb 2022 15:47:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delete-from-delta-table/m-p/29015#M20772</guid>
      <dc:creator>BorislavBlagoev</dc:creator>
      <dc:date>2022-02-09T15:47:51Z</dc:date>
    </item>
    <item>
      <title>Re: Delete from delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/delete-from-delta-table/m-p/29017#M20774</link>
      <description>&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Tue, 15 Feb 2022 11:37:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delete-from-delta-table/m-p/29017#M20774</guid>
      <dc:creator>BorislavBlagoev</dc:creator>
      <dc:date>2022-02-15T11:37:16Z</dc:date>
    </item>
    <item>
      <title>Re: Delete from delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/delete-from-delta-table/m-p/29019#M20776</link>
      <description>&lt;P&gt;Hi @Borislav Blagoev​&amp;nbsp; Could explain a bit more? What is the logic for inserting the records? When ids are not equal what does it mean? Also please check &lt;A href="https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-merge-into.html" alt="https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-merge-into.html" target="_blank"&gt;delta-merge-into&lt;/A&gt; for updating, inserting, and deleting using the merge command. &lt;/P&gt;</description>
      <pubDate>Wed, 09 Mar 2022 01:48:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delete-from-delta-table/m-p/29019#M20776</guid>
      <dc:creator>User16763506477</dc:creator>
      <dc:date>2022-03-09T01:48:24Z</dc:date>
    </item>
    <item>
      <title>Re: Delete from delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/delete-from-delta-table/m-p/29020#M20777</link>
      <description>&lt;P&gt;Hi have the similar issue, I don't see the solution is provided here. I want to perform upcert operation. But along with upcert, I want to delete the records which are missing in source table, but present in the target table. You can think it as a master data update. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Source table contains a full set of master data. This is the latest incoming data.&lt;/LI&gt;&lt;LI&gt;Target table contains the full set of master data. This is old data present in the current database.&lt;/LI&gt;&lt;LI&gt;The source may contains new records, updates to some existing records, or some records might have removed compared to the target.&lt;/LI&gt;&lt;LI&gt;So during MERGE operation, I want to update the matching records in target from source, Insert new incoming records from the source, and delete the records from target which were not present in the source. How to achieve this with databricks MERGE?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I see that the similar operation is possible in &lt;A href="https://www.mssqltips.com/sqlservertip/1704/using-merge-in-sql-server-to-insert-update-and-delete-at-the-same-time/" alt="https://www.mssqltips.com/sqlservertip/1704/using-merge-in-sql-server-to-insert-update-and-delete-at-the-same-time/" target="_blank"&gt;MYSQL Server&lt;/A&gt;, as shown below. But Databricks SQL doesn't support "BY SOURCE" option.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;--Synchronize the target table with refreshed data from source table&lt;/P&gt;&lt;P&gt;MERGE Products AS TARGET&lt;/P&gt;&lt;P&gt;USING UpdatedProducts AS SOURCE &lt;/P&gt;&lt;P&gt;ON (TARGET.ProductID = SOURCE.ProductID) &lt;/P&gt;&lt;P&gt;--When records are matched, update the records if there is any change&lt;/P&gt;&lt;P&gt;WHEN MATCHED AND TARGET.ProductName &amp;lt;&amp;gt; SOURCE.ProductName OR TARGET.Rate &amp;lt;&amp;gt; SOURCE.Rate &lt;/P&gt;&lt;P&gt;THEN UPDATE SET TARGET.ProductName = SOURCE.ProductName, TARGET.Rate = SOURCE.Rate &lt;/P&gt;&lt;P&gt;--When no records are matched, insert the incoming records from source table to target table&lt;/P&gt;&lt;P&gt;WHEN NOT MATCHED BY TARGET &lt;/P&gt;&lt;P&gt;THEN INSERT (ProductID, ProductName, Rate) VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate)&lt;/P&gt;&lt;P&gt;--When there is a row that exists in target and same record does not exist in source then delete this record target&lt;/P&gt;&lt;P&gt;WHEN NOT MATCHED BY SOURCE &lt;/P&gt;&lt;P&gt;THEN DELETE &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Krishna&lt;/P&gt;</description>
      <pubDate>Tue, 24 May 2022 15:11:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delete-from-delta-table/m-p/29020#M20777</guid>
      <dc:creator>Krish-685291</dc:creator>
      <dc:date>2022-05-24T15:11:34Z</dc:date>
    </item>
  </channel>
</rss>

