<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Strategy to add a new table based on silver data in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/strategy-to-add-new-table-base-on-silver-data/m-p/50341#M28771</link>
    <description>&lt;P&gt;I have a merge function that I use with streaming foreachBatch, roughly like this:&lt;BR /&gt;def mergedf(df, i):&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; merge_func_1(df, i)&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; merge_func_2(df, i)&lt;/P&gt;&lt;P&gt;Now I want to add a new merge_func_3 to it.&lt;/P&gt;&lt;P&gt;Is there a best practice for this case? Since the stream runs continuously, how can I process the data from the beginning for merge_func_3 without stopping the stream, running a separate temporary job for merge_func_3, and then restarting the stream with merge_func_3 added?&lt;/P&gt;</description>
    <pubDate>Thu, 02 Nov 2023 10:42:17 GMT</pubDate>
    <dc:creator>Joe1912</dc:creator>
    <dc:date>2023-11-02T10:42:17Z</dc:date>
    <item>
      <title>Strategy to add a new table based on silver data</title>
      <link>https://community.databricks.com/t5/data-engineering/strategy-to-add-new-table-base-on-silver-data/m-p/50341#M28771</link>
      <description>&lt;P&gt;I have a merge function that I use with streaming foreachBatch, roughly like this:&lt;BR /&gt;def mergedf(df, i):&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; merge_func_1(df, i)&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; merge_func_2(df, i)&lt;/P&gt;&lt;P&gt;Now I want to add a new merge_func_3 to it.&lt;/P&gt;&lt;P&gt;Is there a best practice for this case? Since the stream runs continuously, how can I process the data from the beginning for merge_func_3 without stopping the stream, running a separate temporary job for merge_func_3, and then restarting the stream with merge_func_3 added?&lt;/P&gt;</description>
      <pubDate>Thu, 02 Nov 2023 10:42:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/strategy-to-add-new-table-base-on-silver-data/m-p/50341#M28771</guid>
      <dc:creator>Joe1912</dc:creator>
      <dc:date>2023-11-02T10:42:17Z</dc:date>
    </item>
  </channel>
</rss>

