<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to manage data reload in DLT in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-manage-data-reload-in-dlt/m-p/54257#M30031</link>
    <description>&lt;P&gt;Hi, Community members,&lt;/P&gt;&lt;P&gt;I have a situation where I need to reload some data via a DLT pipeline. All data are stored in a landing storage account and have been loaded on a daily basis, for example from 1/Nov to 30/Nov.&lt;/P&gt;&lt;P&gt;For some reason, I need to reload the data of 25/Nov, and I tried to use the following options to force the data reload:&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles.includeExistingFiles"&lt;/SPAN&gt;&lt;SPAN&gt;, includeExistingFiles)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"modifiedBefore"&lt;/SPAN&gt;&lt;SPAN&gt;, modifiedBefore)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"modifiedAfter"&lt;/SPAN&gt;&lt;SPAN&gt;, modifiedAfter)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;However, no data are loaded or reloaded, even after I deleted that day's data from the bronze table. I guess it might be because the checkpoint does not allow me to reload the data. I ended up reloading the data into a new schema, which is not the desired outcome.&lt;/P&gt;&lt;P&gt;Could you please advise how I should manage the data reload scenario?&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
    <pubDate>Thu, 30 Nov 2023 00:58:22 GMT</pubDate>
    <dc:creator>harvey-c</dc:creator>
    <dc:date>2023-11-30T00:58:22Z</dc:date>
    <item>
      <title>How to manage data reload in DLT</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-manage-data-reload-in-dlt/m-p/54257#M30031</link>
      <description>&lt;P&gt;Hi, Community members,&lt;/P&gt;&lt;P&gt;I have a situation where I need to reload some data via a DLT pipeline. All data are stored in a landing storage account and have been loaded on a daily basis, for example from 1/Nov to 30/Nov.&lt;/P&gt;&lt;P&gt;For some reason, I need to reload the data of 25/Nov, and I tried to use the following options to force the data reload:&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"cloudFiles.includeExistingFiles"&lt;/SPAN&gt;&lt;SPAN&gt;, includeExistingFiles)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"modifiedBefore"&lt;/SPAN&gt;&lt;SPAN&gt;, modifiedBefore)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;option&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"modifiedAfter"&lt;/SPAN&gt;&lt;SPAN&gt;, modifiedAfter)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;However, no data are loaded or reloaded, even after I deleted that day's data from the bronze table. I guess it might be because the checkpoint does not allow me to reload the data. I ended up reloading the data into a new schema, which is not the desired outcome.&lt;/P&gt;&lt;P&gt;Could you please advise how I should manage the data reload scenario?&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Thu, 30 Nov 2023 00:58:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-manage-data-reload-in-dlt/m-p/54257#M30031</guid>
      <dc:creator>harvey-c</dc:creator>
      <dc:date>2023-11-30T00:58:22Z</dc:date>
    </item>
  </channel>
</rss>

