<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic GOLD table slowed down at MERGE INTO in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8968#M4474</link>
    <description>&lt;P&gt;Howdy - I recently took a table FACT_TENDER and rebuilt it as a medallion-style table to test performance, since I suspected medallion would be quicker. &lt;/P&gt;&lt;P&gt;Key differences: &lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Both tables use bronze data&lt;/LI&gt;&lt;LI&gt;The original has all logic in one long notebook&lt;UL&gt;&lt;LI&gt;The MERGE INTO that updates/inserts records takes roughly 13 minutes&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;The medallion table reads in the SILVER table and performs two JOINS &lt;UL&gt;&lt;LI&gt;The MERGE INTO that updates/inserts records takes about 2 hours...&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The SILVER table is quick for the medallion build, so I am at a loss here... I have tried optimizing for range joins, filtering out data, and ordering the data, but none of these have worked. Any thoughts? I can provide more detail here.&lt;/P&gt;</description>
    <pubDate>Tue, 21 Feb 2023 19:01:03 GMT</pubDate>
    <dc:creator>JRT5933</dc:creator>
    <dc:date>2023-02-21T19:01:03Z</dc:date>
    <item>
      <title>GOLD table slowed down at MERGE INTO</title>
      <link>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8968#M4474</link>
      <description>&lt;P&gt;Howdy - I recently took a table FACT_TENDER and rebuilt it as a medallion-style table to test performance, since I suspected medallion would be quicker. &lt;/P&gt;&lt;P&gt;Key differences: &lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Both tables use bronze data&lt;/LI&gt;&lt;LI&gt;The original has all logic in one long notebook&lt;UL&gt;&lt;LI&gt;The MERGE INTO that updates/inserts records takes roughly 13 minutes&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;The medallion table reads in the SILVER table and performs two JOINS &lt;UL&gt;&lt;LI&gt;The MERGE INTO that updates/inserts records takes about 2 hours...&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The SILVER table is quick for the medallion build, so I am at a loss here... I have tried optimizing for range joins, filtering out data, and ordering the data, but none of these have worked. Any thoughts? I can provide more detail here.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2023 19:01:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8968#M4474</guid>
      <dc:creator>JRT5933</dc:creator>
      <dc:date>2023-02-21T19:01:03Z</dc:date>
    </item>
    <item>
      <title>Re: GOLD table slowed down at MERGE INTO</title>
      <link>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8969#M4475</link>
      <description>&lt;P&gt;Hi @Jaime Tirado,&lt;/P&gt;&lt;P&gt;Please refer to the blog below, which might help you:&lt;/P&gt;&lt;P&gt;&lt;A href="https://kb.databricks.com/en_US/delta/delta-merge-into" alt="https://kb.databricks.com/en_US/delta/delta-merge-into" target="_blank"&gt;How to improve performance of Delta Lake MERGE INTO queries using partition pruning - Databricks&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 22 Feb 2023 07:13:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8969#M4475</guid>
      <dc:creator>Ajay-Pandey</dc:creator>
      <dc:date>2023-02-22T07:13:45Z</dc:date>
    </item>
    <item>
      <title>Re: GOLD table slowed down at MERGE INTO</title>
      <link>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8970#M4476</link>
      <description>&lt;P&gt;Yes, referring to this blog should give you a much better understanding.&lt;/P&gt;</description>
      <pubDate>Wed, 22 Feb 2023 10:06:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8970#M4476</guid>
      <dc:creator>Rishabh-Pandey</dc:creator>
      <dc:date>2023-02-22T10:06:49Z</dc:date>
    </item>
    <item>
      <title>Re: GOLD table slowed down at MERGE INTO</title>
      <link>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8971#M4477</link>
      <description>&lt;P&gt;I have seen this article, and it is not particularly helpful in my case. I have a DELTA table, so I cannot add a partition.&lt;/P&gt;</description>
      <pubDate>Wed, 22 Feb 2023 13:55:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8971#M4477</guid>
      <dc:creator>JRT5933</dc:creator>
      <dc:date>2023-02-22T13:55:15Z</dc:date>
    </item>
    <item>
      <title>Re: GOLD table slowed down at MERGE INTO</title>
      <link>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8972#M4478</link>
      <description>&lt;P&gt;I ended up instituting tried-and-true PARTITIONING and PRUNING methods to boost performance, which has succeeded.&lt;/P&gt;</description>
      <pubDate>Wed, 22 Feb 2023 15:30:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/gold-table-slowed-down-at-merge-into/m-p/8972#M4478</guid>
      <dc:creator>JRT5933</dc:creator>
      <dc:date>2023-02-22T15:30:37Z</dc:date>
    </item>
  </channel>
</rss>

