<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Medallion architecture, how to update Gold tables? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8153#M3864</link>
    <description>&lt;P&gt;When the Gold table contains an aggregated value, I think it should be calculated from all records in the Silver table.&lt;/P&gt;</description>
    <pubDate>Tue, 07 Mar 2023 21:24:57 GMT</pubDate>
    <dc:creator>Mado</dc:creator>
    <dc:date>2023-03-07T21:24:57Z</dc:date>
    <item>
      <title>Medallion architecture, how to update Gold tables?</title>
      <link>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8151#M3862</link>
      <description>&lt;P&gt;Assume that I have a data source that is ingested into a few bronze tables and transformed into a silver table. Next, a gold table is created by aggregating the silver table.&lt;/P&gt;&lt;P&gt;When new records arrive in the data source, the bronze and silver tables are updated by appending the new records. Since the gold table contains aggregated values, appending to it is meaningless.&lt;/P&gt;&lt;P&gt;I'd like to know which approach is recommended for updating gold tables when the dataset is large:&lt;/P&gt;&lt;P&gt;1) Drop the current gold table and re-create it&lt;/P&gt;&lt;P&gt;2) Overwrite the gold table&lt;/P&gt;&lt;P&gt;The first option seems slower for a large dataset. However, I'd like to know whether option 2 carries any risk (e.g. the table not being overwritten correctly).&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2023 12:30:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8151#M3862</guid>
      <dc:creator>Mado</dc:creator>
      <dc:date>2023-03-07T12:30:25Z</dc:date>
    </item>
    <item>
      <title>Re: Medallion architecture, how to update Gold tables?</title>
      <link>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8152#M3863</link>
      <description>&lt;P&gt;@Mohammad Saber&lt;/P&gt;&lt;P&gt;Why not use MERGE? Or even Change Data Feed (CDF) + MERGE to process only the increments.&lt;/P&gt;&lt;P&gt;&lt;A href="https://sarnendude.com/delta-lakes-change-data-feed-cdf-demo-in-azure-databricks/" target="test_blank"&gt;https://sarnendude.com/delta-lakes-change-data-feed-cdf-demo-in-azure-databricks/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2023 14:18:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8152#M3863</guid>
      <dc:creator>daniel_sahal</dc:creator>
      <dc:date>2023-03-07T14:18:46Z</dc:date>
    </item>
    <item>
      <title>Re: Medallion architecture, how to update Gold tables?</title>
      <link>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8153#M3864</link>
      <description>&lt;P&gt;When the Gold table contains an aggregated value, I think it should be calculated from all records in the Silver table.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2023 21:24:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8153#M3864</guid>
      <dc:creator>Mado</dc:creator>
      <dc:date>2023-03-07T21:24:57Z</dc:date>
    </item>
    <item>
      <title>Re: Medallion architecture, how to update Gold tables?</title>
      <link>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8154#M3865</link>
      <description>&lt;P&gt;Hi @Mohammad Saber&lt;/P&gt;&lt;P&gt;Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.&lt;/P&gt;&lt;P&gt;Please help us select the best solution by clicking on "Select As Best" if it does.&lt;/P&gt;&lt;P&gt;Your feedback will help us ensure that we are providing the best possible service to you. Thank you!&lt;/P&gt;</description>
      <pubDate>Sat, 01 Apr 2023 01:10:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8154#M3865</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-04-01T01:10:16Z</dc:date>
    </item>
    <item>
      <title>Re: Medallion architecture, how to update Gold tables?</title>
      <link>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8155#M3866</link>
      <description>&lt;P&gt;Hi @Vidula Khanna&lt;/P&gt;&lt;P&gt;The answer didn't fit my question.&lt;/P&gt;&lt;P&gt;For the MERGE approach, I found a good article here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://medium.com/@avnishjain22/simplify-optimise-and-improve-your-data-pipelines-with-incremental-etl-on-the-lakehouse-61b279afadea" alt="https://medium.com/@avnishjain22/simplify-optimise-and-improve-your-data-pipelines-with-incremental-etl-on-the-lakehouse-61b279afadea" target="_blank"&gt;https://medium.com/@avnishjain22/simplify-optimise-and-improve-your-data-pipelines-with-incremental-etl-on-the-lakehouse-61b279afadea&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 01 Apr 2023 05:47:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/medallion-architecture-how-to-update-gold-tables/m-p/8155#M3866</guid>
      <dc:creator>Mado</dc:creator>
      <dc:date>2023-04-01T05:47:33Z</dc:date>
    </item>
  </channel>
</rss>

