<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Need advice from someone with practical experience in DLT in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/need-an-advice-of-someone-with-practical-experience-in-dlt/m-p/101876#M40863</link>
    <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/106294"&gt;@Alberto_Umana&lt;/a&gt;, this was exactly what I needed. It solved my problem, thanks!&lt;/P&gt;</description>
    <pubDate>Thu, 12 Dec 2024 08:41:19 GMT</pubDate>
    <dc:creator>Fatimah-Tariq</dc:creator>
    <dc:date>2024-12-12T08:41:19Z</dc:date>
    <item>
      <title>Need advice from someone with practical experience in DLT</title>
      <link>https://community.databricks.com/t5/data-engineering/need-an-advice-of-someone-with-practical-experience-in-dlt/m-p/101585#M40733</link>
      <description>&lt;P&gt;Hi, I'm facing a scenario in my DLT pipeline. In my silver layer I filter out test data so it does not reach the silver schema, and at the end I use apply_changes to create the tables, with a sequence_by clause to keep the most recently updated version of each record.&lt;/P&gt;&lt;P&gt;The issue arises when the most recent version of a record has its test flag set to true, turning it into a test record. My filtering logic then drops it, so by the time the data reaches the sequence_by clause inside apply_changes, the latest entry is already gone. Because the filtered stream carries no delete for it, the previous version of the record is treated as the most recent one and is forwarded to the silver schema, even though that version is now outdated and should not be there.&lt;/P&gt;&lt;P&gt;In short, outdated records are moving forward to the silver schema because of the filtering.&lt;/P&gt;&lt;P&gt;What is the best approach to handle this situation?&lt;/P&gt;&lt;P&gt;This is my silver layer's code structure:&lt;/P&gt;&lt;PRE&gt;@dlt.view(name=bronze_dlt_view)
def bronze_source():
    # code to fetch tables from bronze and apply filtering
    ...

dlt.create_streaming_table(
    name=silver_table,
    table_properties=table_props,
    comment="Silver table with MERGE into logic from bronze",
)

dlt.apply_changes(
    target=silver_table,
    source=bronze_dlt_view,
    keys=primary_keys,
    sequence_by=col(sequence_col),
    stored_as_scd_type=1,
)&lt;/PRE&gt;</description>
      <pubDate>Tue, 10 Dec 2024 12:14:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-an-advice-of-someone-with-practical-experience-in-dlt/m-p/101585#M40733</guid>
      <dc:creator>Fatimah-Tariq</dc:creator>
      <dc:date>2024-12-10T12:14:24Z</dc:date>
    </item>
    <item>
      <title>Re: Need advice from someone with practical experience in DLT</title>
      <link>https://community.databricks.com/t5/data-engineering/need-an-advice-of-someone-with-practical-experience-in-dlt/m-p/101645#M40757</link>
      <description>&lt;P class="p1"&gt;To address the issue of outdated records moving forward to the silver schema in your Delta Live Tables (DLT) pipeline, consider the following approaches.&lt;/P&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Modify the filtering logic&lt;/STRONG&gt;: instead of filtering out the test records before the apply_changes call, handle the filtering within apply_changes itself. That way the sequence of records is maintained correctly and outdated records are not propagated forward.&lt;/P&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Use the apply_as_deletes parameter&lt;/STRONG&gt;: the apply_as_deletes parameter of apply_changes lets you mark records as deleted based on an expression over your test flag. Records whose test flag becomes true are then applied as deletions instead of being carried forward.&lt;/P&gt;
&lt;P class="p1"&gt;By handling the filtering within apply_changes, the most recent version of each record is processed correctly, and outdated records no longer reach the silver schema.&lt;/P&gt;
&lt;P class="p1"&gt;Please see: &lt;A href="https://docs.databricks.com/en/delta-live-tables/cdc.html" target="_blank"&gt;https://docs.databricks.com/en/delta-live-tables/cdc.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Dec 2024 18:00:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-an-advice-of-someone-with-practical-experience-in-dlt/m-p/101645#M40757</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2024-12-10T18:00:32Z</dc:date>
    </item>
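To make the apply_as_deletes suggestion concrete, here is a minimal pure-Python simulation of the two strategies. The helper and the field names (`id`, `seq`, `is_test`, `val`) are illustrative, not part of the thread's pipeline; on Databricks the real fix would pass something like `apply_as_deletes = expr("is_test = true")` to `dlt.apply_changes`. The sketch shows why pre-filtering leaks the stale version while delete-style handling removes the key:

```python
def apply_changes_sim(rows):
    """Keep the highest-sequence row per key (sequence_by semantics);
    if the winning row is flagged is_test, drop the key entirely,
    mimicking apply_as_deletes behaviour."""
    latest = {}
    for row in rows:
        cur = latest.get(row["id"])
        if cur is None or row["seq"] > cur["seq"]:
            latest[row["id"]] = row
    # A winning test-flagged row acts as a delete for its key.
    return {k: v for k, v in latest.items() if not v["is_test"]}

changes = [
    {"id": 1, "seq": 1, "is_test": False, "val": "v1"},
    {"id": 1, "seq": 2, "is_test": True,  "val": "v2"},  # latest version is a test record
]

# Strategy A: filter test rows *before* apply_changes -> the stale v1 survives.
filtered = [r for r in changes if not r["is_test"]]
stale = apply_changes_sim(filtered)

# Strategy B: let every row through; the test row wins on seq and deletes id=1.
clean = apply_changes_sim(changes)

print(stale)  # the outdated v1 record leaks into the target
print(clean)  # {} -- the record is correctly removed
```

Strategy B works because the delete decision is made *after* sequencing, so the highest-sequence version of each record is always the one that decides its fate.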
    <item>
      <title>Re: Need advice from someone with practical experience in DLT</title>
      <link>https://community.databricks.com/t5/data-engineering/need-an-advice-of-someone-with-practical-experience-in-dlt/m-p/101876#M40863</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/106294"&gt;@Alberto_Umana&lt;/a&gt;, this was exactly what I needed. It solved my problem, thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 12 Dec 2024 08:41:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/need-an-advice-of-someone-with-practical-experience-in-dlt/m-p/101876#M40863</guid>
      <dc:creator>Fatimah-Tariq</dc:creator>
      <dc:date>2024-12-12T08:41:19Z</dc:date>
    </item>
  </channel>
</rss>

