
Delta table size not shrinking after Vacuum

Spenyo
New Contributor II

Hi team.


Once a day we overwrite the last X months of data in our tables, so every day generates a large amount of history. We don't use time travel, so we don't need that history.

What we have done:

SET spark.databricks.delta.retentionDurationCheck.enabled = false 
ALTER TABLE table_name SET TBLPROPERTIES ('delta.logRetentionDuration'='interval 48 HOURS', 'delta.deletedFileRetentionDuration'='interval 48 HOURS')

We run the following commands on the Delta tables (full is the table name):

print("   OPTIMIZE,Vacuum")
spark.sql("REORG TABLE {} APPLY ( PURGE )".format(full))
spark.sql("OPTIMIZE {}".format(full))
spark.sql("VACUUM {} RETAIN 48 HOURS".format(full))
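For readers who want to reproduce the sequence above, here is a minimal sketch that builds the same three statements in the same order. The helper names `maintenance_statements` and `run_maintenance` are hypothetical and not part of the original post, and `spark` is assumed to be an existing SparkSession. The optional `DRY RUN` suffix on `VACUUM` is an addition not used in the post: it lists the files that would be deleted without removing them, which may help check whether anything is eligible for deletion.

```python
def maintenance_statements(table_name, retain_hours=48, dry_run=False):
    """Build the SQL statements from the post, in the same order."""
    vacuum = "VACUUM {} RETAIN {} HOURS".format(table_name, retain_hours)
    if dry_run:
        # DRY RUN (not used in the post) only lists the files that
        # VACUUM would delete; nothing is removed.
        vacuum += " DRY RUN"
    return [
        "REORG TABLE {} APPLY (PURGE)".format(table_name),
        "OPTIMIZE {}".format(table_name),
        vacuum,
    ]


def run_maintenance(spark, table_name, retain_hours=48, dry_run=False):
    """Run each statement through an existing SparkSession (assumed)."""
    for statement in maintenance_statements(table_name, retain_hours, dry_run):
        print(statement)
        spark.sql(statement)
```

Calling `run_maintenance(spark, full, dry_run=True)` first would show what `VACUUM` intends to delete before running it for real.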

After this, my understanding is that there will be only 48 hours of history and the file size should shrink.
After the run, the history looks like this:

chrome_KZMxPl8x1d.png

The history stays as is and the file size is the same.
Can you provide some additional information on what I am doing wrong, or have I misunderstood the concept?

 

1 REPLY

pabloaschieri
New Contributor

Hi, any update on this? Thanks