<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Retention Period for Parquet Data in e.g. S3 After Dropping a Managed Delta Table in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114495#M44843</link>
    <description>&lt;P&gt;Hey community,&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;I have a question regarding the data retention policy for managed Delta tables stored e.g. in Amazon S3.&lt;/SPAN&gt; Specifically:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;When a managed Delta table is dropped, what is the retention period for the underlying Parquet data files in S3 before they are permanently deleted?&lt;/SPAN&gt;&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;I understand that Unity Catalog supports the UNDROP TABLE command to recover dropped managed tables within 7 days.&lt;/SPAN&gt; &lt;SPAN class=""&gt;However, I am interested in understanding the total duration the data remains in S3 before it is permanently removed.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;In the past, the documentation mentioned 30 days, but I cannot find this information in the current documentation. I guess it was updated, since the German Azure documentation still mentions the 30 days: &lt;A href="https://learn.microsoft.com/de-de/azure/databricks/sql/language-manual/sql-ref-syntax-ddl-drop-table" target="_self"&gt;https://learn.microsoft.com/de-de/azure/databricks/sql/language-manual/sql-ref-syntax-ddl-drop-table&lt;/A&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;Additionally, is there a way to configure this retention period, or to expedite the deletion process if immediate removal of data is required?&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;Thank you!&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 04 Apr 2025 09:16:29 GMT</pubDate>
    <dc:creator>Volker</dc:creator>
    <dc:date>2025-04-04T09:16:29Z</dc:date>
    <item>
      <title>Retention Period for Parquet Data in e.g. S3 After Dropping a Managed Delta Table</title>
      <link>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114495#M44843</link>
      <description>&lt;P&gt;Hey community,&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;I have a question regarding the data retention policy for managed Delta tables stored e.g. in Amazon S3.&lt;/SPAN&gt; Specifically:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;When a managed Delta table is dropped, what is the retention period for the underlying Parquet data files in S3 before they are permanently deleted?&lt;/SPAN&gt;&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;I understand that Unity Catalog supports the UNDROP TABLE command to recover dropped managed tables within 7 days.&lt;/SPAN&gt; &lt;SPAN class=""&gt;However, I am interested in understanding the total duration the data remains in S3 before it is permanently removed.&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;In the past, the documentation mentioned 30 days, but I cannot find this information in the current documentation. I guess it was updated, since the German Azure documentation still mentions the 30 days: &lt;A href="https://learn.microsoft.com/de-de/azure/databricks/sql/language-manual/sql-ref-syntax-ddl-drop-table" target="_self"&gt;https://learn.microsoft.com/de-de/azure/databricks/sql/language-manual/sql-ref-syntax-ddl-drop-table&lt;/A&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;Additionally, is there a way to configure this retention period, or to expedite the deletion process if immediate removal of data is required?&lt;/SPAN&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;Thank you!&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 04 Apr 2025 09:16:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114495#M44843</guid>
      <dc:creator>Volker</dc:creator>
      <dc:date>2025-04-04T09:16:29Z</dc:date>
    </item>
    <item>
      <title>Re: Retention Period for Parquet Data in e.g. S3 After Dropping a Managed Delta Table</title>
      <link>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114512#M44846</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/98256"&gt;@Volker&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;The default retention period for managed Delta table data files in Unity Catalog is &lt;STRONG&gt;30 days&lt;/STRONG&gt;. I would check whether there is a setting to reduce it to immediate removal.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Apr 2025 12:02:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114512#M44846</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2025-04-04T12:02:52Z</dc:date>
    </item>
    <item>
      <title>Re: Retention Period for Parquet Data in e.g. S3 After Dropping a Managed Delta Table</title>
      <link>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114513#M44847</link>
      <description>&lt;P&gt;Thank you for the quick response!&lt;BR /&gt;It would be great if this default retention period were mentioned in the docs again.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Apr 2025 12:05:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114513#M44847</guid>
      <dc:creator>Volker</dc:creator>
      <dc:date>2025-04-04T12:05:04Z</dc:date>
    </item>
    <item>
      <title>Re: Retention Period for Parquet Data in e.g. S3 After Dropping a Managed Delta Table</title>
      <link>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114516#M44849</link>
      <description>&lt;P&gt;No problem.&lt;/P&gt;
&lt;P&gt;Please see:&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/delta/table-properties" target="_blank"&gt;https://learn.microsoft.com/en-us/azure/databricks/delta/table-properties&lt;/A&gt;. It lists the retention-related properties for Delta tables, including a 30-day default for table history:&lt;/P&gt;
&lt;UL class="p-rich_text_list p-rich_text_list__bullet p-rich_text_list--nested" data-stringify-type="unordered-list" data-list-tree="true" data-indent="0" data-border="0"&gt;
&lt;LI data-stringify-indent="0" data-stringify-border="0"&gt;&lt;CODE class="c-mrkdwn__code" data-stringify-type="code"&gt;delta.logRetentionDuration = "interval &amp;lt;interval&amp;gt;"&lt;/CODE&gt;: controls how long the history for a table is kept. The default is&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE class="c-mrkdwn__code" data-stringify-type="code"&gt;interval 30 days&lt;/CODE&gt;.&lt;/LI&gt;
&lt;LI data-stringify-indent="0" data-stringify-border="0"&gt;&lt;CODE class="c-mrkdwn__code" data-stringify-type="code"&gt;delta.deletedFileRetentionDuration = "interval &amp;lt;interval&amp;gt;"&lt;/CODE&gt;: determines the threshold&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE class="c-mrkdwn__code" data-stringify-type="code"&gt;VACUUM&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;uses to remove data files no longer referenced in the current table version. The default is&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE class="c-mrkdwn__code" data-stringify-type="code"&gt;interval 7 days&lt;/CODE&gt;.&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Fri, 04 Apr 2025 12:17:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114516#M44849</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2025-04-04T12:17:05Z</dc:date>
    </item>
    <item>
      <title>Re: Retention Period for Parquet Data in e.g. S3 After Dropping a Managed Delta Table</title>
      <link>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114517#M44850</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Thanks for the resources!&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;So, to adjust how long Parquet files are stored in the S3 bucket after I drop a table, I would need to adjust delta.logRetentionDuration, right?&lt;BR /&gt;And since dropping a Delta table marks the files for deletion after 7 days, I would need to wait 37 days for the files to be permanently deleted with the default settings, right?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 04 Apr 2025 12:28:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/retention-period-for-parquet-data-in-e-g-s3-after-dropping-a/m-p/114517#M44850</guid>
      <dc:creator>Volker</dc:creator>
      <dc:date>2025-04-04T12:28:18Z</dc:date>
    </item>
  </channel>
</rss>

