<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: vacuum does not work as expected in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/vacuum-does-not-work-as-expected/m-p/122508#M46798</link>
    <description>&lt;P&gt;Could you print out and provide the values of the 2 parameters?&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/115637"&gt;@Ramukamath1988&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;This is precisely my observation after vacuuming. I do understand these 2 parameters, but it's not working as expected. Even after vacuuming (with a retention of 30 days), we can go back 2 months, and logs are retained for more than 3 months.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;Are the data files that are 2 months old still referenced by the current version of the table?&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 23 Jun 2025 09:07:12 GMT</pubDate>
    <dc:creator>Raghavan93513</dc:creator>
    <dc:date>2025-06-23T09:07:12Z</dc:date>
    <item>
      <title>vacuum does not work as expected</title>
      <link>https://community.databricks.com/t5/data-engineering/vacuum-does-not-work-as-expected/m-p/122475#M46786</link>
      <description>&lt;P&gt;&lt;SPAN&gt;&lt;STRONG&gt;delta.logRetentionDuration&lt;/STRONG&gt; (default 30 days) is generally not set on any table in my workspace. As per the documentation, you can time travel within the log retention duration, provided &lt;STRONG&gt;&lt;U&gt;delta.deletedFileRetentionDuration&lt;/U&gt;&lt;/STRONG&gt; is also set to &lt;STRONG&gt;&lt;U&gt;30 days&lt;/U&gt;&lt;/STRONG&gt;, which is the case in my example below. We run VACUUM with a 30-day retention every weekend.&lt;BR /&gt;&lt;BR /&gt;I have 2 questions on this matter:&lt;/SPAN&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;Why can I still go back to the April 22nd version, which is more than 30 days old?&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Why do the version numbers start from 100? What happened to the previous versions?&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;SPAN&gt;&lt;A href="https://docs.databricks.com/gcp/en/delta/history" target="_blank" rel="noopener"&gt;https://docs.databricks.com/gcp/en/delta/history&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;#delta #vacuum&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 22 Jun 2025 18:50:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/vacuum-does-not-work-as-expected/m-p/122475#M46786</guid>
      <dc:creator>Ramukamath1988</dc:creator>
      <dc:date>2025-06-22T18:50:14Z</dc:date>
    </item>
    <item>
      <title>Re: vacuum does not work as expected</title>
      <link>https://community.databricks.com/t5/data-engineering/vacuum-does-not-work-as-expected/m-p/122490#M46792</link>
      <description>&lt;P&gt;&lt;SPAN&gt;There are two configurations that govern your retention period:&lt;/SPAN&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;delta.logRetentionDuration - This configuration specifies how long a Delta table's transaction log history is kept. The default retention period is 30 days, after which older log entries may be deleted.&lt;/LI&gt;
&lt;LI&gt;delta.deletedFileRetentionDuration - This setting determines the retention period for stale data files that are no longer referenced by the current version of the table. Stale files remain available for a default retention period of 7 days before they become eligible for deletion via the VACUUM command.&lt;/LI&gt;
&lt;/UL&gt;
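&lt;P&gt;As a rough illustration of how the two thresholds act independently, here is a minimal Python sketch. It assumes the documented defaults (30 days for delta.logRetentionDuration, 7 days for delta.deletedFileRetentionDuration). The function names are hypothetical and are not part of any Delta Lake API, and the sketch only models the age checks, not the actual cleanup.&lt;/P&gt;
&lt;PRE&gt;
```python
from datetime import datetime, timedelta

# Assumed defaults, taken from the property descriptions in this thread.
LOG_RETENTION = timedelta(days=30)          # delta.logRetentionDuration
DELETED_FILE_RETENTION = timedelta(days=7)  # delta.deletedFileRetentionDuration


def log_entry_may_be_cleaned(commit_time, now, log_retention=LOG_RETENTION):
    """Hypothetical helper: True once a commit is older than the log retention window.

    Being past the window only makes the entry a candidate for cleanup.
    """
    return now - commit_time > log_retention


def file_eligible_for_vacuum(removed_time, now, file_retention=DELETED_FILE_RETENTION):
    """Hypothetical helper: True once a removed data file is past the file retention window.

    removed_time is None for a file that the current table version still references;
    such a file is never eligible, whatever its age.
    """
    if removed_time is None:
        return False
    return now - removed_time > file_retention


if __name__ == "__main__":
    now = datetime(2025, 6, 22)
    print(log_entry_may_be_cleaned(datetime(2025, 4, 22), now))
    print(file_eligible_for_vacuum(datetime(2025, 6, 10), now, timedelta(days=30)))
```
&lt;/PRE&gt;
&lt;P&gt;Under these assumptions, a file removed 12 days ago is eligible for VACUUM with the 7-day default but not with a 30-day retention, and a file that the current table version still references is never eligible. An old version therefore stays readable for as long as its log entry and all of its data files are still present.&lt;/P&gt;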
&lt;P&gt;Now, based on the above provided context, I will answer your questions:&lt;/P&gt;
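&lt;P&gt;&lt;SPAN&gt;Q) Why can I still go back to the April 22nd version, which is more than 30 days old?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;By default, you can't access data older than 7 days, because delta.deletedFileRetentionDuration defaults to 7 days. If you run the VACUUM operation with that threshold, data files that were removed from the table more than 7 days earlier will have been deleted. VACUUM never deletes files that the current version of the table still references, so an older version that relies only on such files stays readable.&lt;/P&gt;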
&lt;P&gt;&lt;SPAN&gt;Q) Why do the version numbers start from 100? What happened to the previous versions?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;You can only see versions up to about 30 days old because the default value of delta.logRetentionDuration is 30 days. Older log entries are cleaned up, so the earlier versions no longer appear in the table history.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jun 2025 06:20:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/vacuum-does-not-work-as-expected/m-p/122490#M46792</guid>
      <dc:creator>Raghavan93513</dc:creator>
      <dc:date>2025-06-23T06:20:25Z</dc:date>
    </item>
    <item>
      <title>Re: vacuum does not work as expected</title>
      <link>https://community.databricks.com/t5/data-engineering/vacuum-does-not-work-as-expected/m-p/122503#M46797</link>
      <description>&lt;P&gt;This is precisely my observation after vacuuming. I do understand these 2 parameters, but it's not working as expected. Even after vacuuming (with a retention of 30 days), we can go back 2 months, and logs are retained for more than 3 months.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jun 2025 07:41:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/vacuum-does-not-work-as-expected/m-p/122503#M46797</guid>
      <dc:creator>Ramukamath1988</dc:creator>
      <dc:date>2025-06-23T07:41:47Z</dc:date>
    </item>
    <item>
      <title>Re: vacuum does not work as expected</title>
      <link>https://community.databricks.com/t5/data-engineering/vacuum-does-not-work-as-expected/m-p/122508#M46798</link>
      <description>&lt;P&gt;Could you print out and provide the values of the 2 parameters?&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/115637"&gt;@Ramukamath1988&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;This is precisely my observation after vacuuming. I do understand these 2 parameters, but it's not working as expected. Even after vacuuming (with a retention of 30 days), we can go back 2 months, and logs are retained for more than 3 months.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;Are the data files that are 2 months old still referenced by the current version of the table?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jun 2025 09:07:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/vacuum-does-not-work-as-expected/m-p/122508#M46798</guid>
      <dc:creator>Raghavan93513</dc:creator>
      <dc:date>2025-06-23T09:07:12Z</dc:date>
    </item>
  </channel>
</rss>

