<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Procedure of retrieving archived data from delta table in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104455#M41757</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88823"&gt;@Walter_C&lt;/a&gt;&amp;nbsp;Thank you for your reply. However, some parts need further clarification.&lt;/P&gt;&lt;P&gt;Assume I have already set&amp;nbsp;delta.timeUntilArchived to 1825 days (5 years) and configured a lifecycle policy on the storage side, aligned with the Databricks setting, that moves files to the archive tier 5 years after creation.&lt;/P&gt;&lt;P&gt;Later, I get a requirement to retrieve data before 7 years. I expect that part of that data has been moved to the archive tier and needs to be restored. Should I change&amp;nbsp;delta.timeUntilArchived to 2555 days (7 years), or keep it as is at&amp;nbsp;1825 days (5 years)?&lt;/P&gt;&lt;P&gt;I would also like to confirm whether my understanding of the procedure for restoring archived data is correct. This is what I have in mind:&lt;/P&gt;&lt;P&gt;Step 1: Run&amp;nbsp;SHOW ARCHIVED FILES to check which data files need to be moved back to the hot tier&lt;/P&gt;&lt;P&gt;Step 2: Move the files back to the hot tier on the storage side&lt;/P&gt;&lt;P&gt;Step 3: Update the&amp;nbsp;delta.timeUntilArchived setting to 2555 days (7 years) on the Databricks side&lt;/P&gt;&lt;P&gt;I assume the procedure should be the same for both cases, &lt;STRONG&gt;1) before 7 years&lt;/STRONG&gt; and &lt;STRONG&gt;2) whole time&lt;/STRONG&gt;, right?&lt;/P&gt;&lt;P&gt;Please correct me if I have misunderstood anything. Thank you.&lt;/P&gt;</description>
    <pubDate>Tue, 07 Jan 2025 04:02:20 GMT</pubDate>
    <dc:creator>Brianben</dc:creator>
    <dc:date>2025-01-07T04:02:20Z</dc:date>
    <item>
      <title>Procedure of retrieving archived data from delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104269#M41699</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I am currently researching the archival support feature in Databricks.&amp;nbsp;&lt;A href="https://docs.databricks.com/en/optimizations/archive-delta.html" target="_blank"&gt;https://docs.databricks.com/en/optimizations/archive-delta.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Let's say I have enabled archival support and configured data to be archived after 5 years, and I have also configured a lifecycle management policy to move data files to the archive tier after 5 years.&lt;/P&gt;&lt;P&gt;I would like to know the procedure for retrieving the archived data. As I understand it, I should first move the corresponding data files from the archive tier to the hot tier on the storage side.&lt;/P&gt;&lt;P&gt;What should I do on the Databricks side if I want to retrieve the data &lt;STRONG&gt;1) before 7 years&lt;/STRONG&gt; and &lt;STRONG&gt;2) from the very beginning&lt;/STRONG&gt;?&lt;/P&gt;&lt;P&gt;I would highly appreciate it if someone could help me with this question. Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Mon, 06 Jan 2025 04:11:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104269#M41699</guid>
      <dc:creator>Brianben</dc:creator>
      <dc:date>2025-01-06T04:11:03Z</dc:date>
    </item>
    <item>
      <title>Re: Procedure of retrieving archived data from delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104328#M41705</link>
      <description>
&lt;P&gt;If you want to retrieve data before 7 years, make sure the &lt;CODE&gt;delta.timeUntilArchived&lt;/CODE&gt; property is set to a value that matches your archival policy (e.g., 5 years).&lt;BR /&gt;Use the &lt;CODE&gt;SHOW ARCHIVED FILES&lt;/CODE&gt; command to identify the files you need, then follow your cloud provider's instructions to move those files back to the hot tier.&lt;/P&gt;
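&lt;P&gt;As a rough sketch of the Databricks side, assuming a hypothetical table named &lt;CODE&gt;sales_history&lt;/CODE&gt; with a hypothetical &lt;CODE&gt;order_date&lt;/CODE&gt; column (neither comes from this thread, and the date literal is only illustrative), the two statements could look like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;-- Assumed table name; keeps the property aligned with a 5-year lifecycle policy.
ALTER TABLE sales_history SET TBLPROPERTIES (delta.timeUntilArchived = '1825 days');

-- Returns URIs of archived files matching the (illustrative) predicate.
SHOW ARCHIVED FILES FOR sales_history WHERE order_date &amp;lt; '2020-01-01';&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The returned URIs are what you would then restore with your storage provider's tooling.&lt;/P&gt;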
</description>
      <pubDate>Mon, 06 Jan 2025 12:06:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104328#M41705</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2025-01-06T12:06:22Z</dc:date>
    </item>
    <item>
      <title>Re: Procedure of retrieving archived data from delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104455#M41757</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88823"&gt;@Walter_C&lt;/a&gt;&amp;nbsp;Thank you for your reply. However, some parts need further clarification.&lt;/P&gt;&lt;P&gt;Assume I have already set&amp;nbsp;delta.timeUntilArchived to 1825 days (5 years) and configured a lifecycle policy on the storage side, aligned with the Databricks setting, that moves files to the archive tier 5 years after creation.&lt;/P&gt;&lt;P&gt;Later, I get a requirement to retrieve data before 7 years. I expect that part of that data has been moved to the archive tier and needs to be restored. Should I change&amp;nbsp;delta.timeUntilArchived to 2555 days (7 years), or keep it as is at&amp;nbsp;1825 days (5 years)?&lt;/P&gt;&lt;P&gt;I would also like to confirm whether my understanding of the procedure for restoring archived data is correct. This is what I have in mind:&lt;/P&gt;&lt;P&gt;Step 1: Run&amp;nbsp;SHOW ARCHIVED FILES to check which data files need to be moved back to the hot tier&lt;/P&gt;&lt;P&gt;Step 2: Move the files back to the hot tier on the storage side&lt;/P&gt;&lt;P&gt;Step 3: Update the&amp;nbsp;delta.timeUntilArchived setting to 2555 days (7 years) on the Databricks side&lt;/P&gt;&lt;P&gt;I assume the procedure should be the same for both cases, &lt;STRONG&gt;1) before 7 years&lt;/STRONG&gt; and &lt;STRONG&gt;2) whole time&lt;/STRONG&gt;, right?&lt;/P&gt;&lt;P&gt;Please correct me if I have misunderstood anything. Thank you.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2025 04:02:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104455#M41757</guid>
      <dc:creator>Brianben</dc:creator>
      <dc:date>2025-01-07T04:02:20Z</dc:date>
    </item>
    <item>
      <title>Re: Procedure of retrieving archived data from delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104720#M41854</link>
      <description>&lt;P class="_1t7bu9h1 paragraph"&gt;To retrieve data before 7 years, you do not need to change the &lt;CODE&gt;delta.timeUntilArchived&lt;/CODE&gt; setting from 1825 days (5 years) to 2555 days (7 years). You can keep it as is. The procedure for restoring archived data is as follows:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;STRONG&gt;Run &lt;CODE&gt;SHOW ARCHIVED FILES&lt;/CODE&gt;&lt;/STRONG&gt;: Use the &lt;CODE&gt;SHOW ARCHIVED FILES&lt;/CODE&gt; command to identify the files that need to be moved back to the hot tier. The syntax is:&lt;/P&gt;
&lt;DIV class="gb5fhw2"&gt;
&lt;PRE&gt;&lt;CODE class="markdown-code-sql _1t7bu9hb hljs language-sql gb5fhw3"&gt;&lt;SPAN class="hljs-keyword"&gt;SHOW&lt;/SPAN&gt; ARCHIVED FILES &lt;SPAN class="hljs-keyword"&gt;FOR&lt;/SPAN&gt; &lt;SPAN class="hljs-operator"&gt;&amp;lt;&lt;/SPAN&gt;table_name&lt;SPAN class="hljs-operator"&gt;&amp;gt;&lt;/SPAN&gt; [ &lt;SPAN class="hljs-keyword"&gt;WHERE&lt;/SPAN&gt; &lt;SPAN class="hljs-operator"&gt;&amp;lt;&lt;/SPAN&gt;predicate&lt;SPAN class="hljs-operator"&gt;&amp;gt;&lt;/SPAN&gt; ];&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;This operation returns URIs for archived files as a Spark DataFrame.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;STRONG&gt;Move Files Back to Hot Tier on Storage Side&lt;/STRONG&gt;: Restore the necessary archived files following documented instructions from your object storage provider.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;STRONG&gt;Update &lt;CODE&gt;delta.timeUntilArchived&lt;/CODE&gt; Setting&lt;/STRONG&gt;: If you need to access data older than the current archival threshold, update the &lt;CODE&gt;delta.timeUntilArchived&lt;/CODE&gt; setting to the new value (e.g., 2555 days for 7 years). This step ensures that Databricks recognizes the restored files as part of the active dataset.&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Wed, 08 Jan 2025 15:07:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104720#M41854</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2025-01-08T15:07:38Z</dc:date>
    </item>
    <item>
      <title>Re: Procedure of retrieving archived data from delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104816#M41891</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88823"&gt;@Walter_C&lt;/a&gt;&amp;nbsp;Thank you for the reply. However, it is a bit confusing. At the beginning of your reply, you said I do not need to change the&amp;nbsp;&lt;SPAN&gt;delta.timeUntilArchived setting, but in step 3 you said I have to update the setting.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Do you mean I should not change&amp;nbsp;&lt;SPAN&gt;delta.timeUntilArchived&lt;/SPAN&gt; before moving the files to the hot tier, but after I move them to the hot tier I need to change the setting in order to query the restored data?&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Could you please elaborate? Thank you very much.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2025 01:53:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/procedure-of-retrieving-archived-data-from-delta-table/m-p/104816#M41891</guid>
      <dc:creator>Brianben</dc:creator>
      <dc:date>2025-01-09T01:53:39Z</dc:date>
    </item>
  </channel>
</rss>

