
Delta table size not shrinking after Vacuum

Spenyo
New Contributor II

Hi team.


Once every day we overwrite the last X months of data in our tables, so each run generates a large amount of history. We don't use time travel, so we don't need to keep it.
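For context, the daily load is roughly like the following minimal sketch (the source table, the event_date column and the 3-month window are only illustrative, and spark is the Databricks session):

from pyspark.sql import functions as F

# read the source and keep only the most recent months (the "last X months" above)
df = spark.read.table("source_table").filter(
    F.col("event_date") >= F.add_months(F.current_date(), -3)
)

# overwrite only the matching date range in the Delta target table
(df.write
    .format("delta")
    .mode("overwrite")
    .option("replaceWhere", "event_date >= add_months(current_date(), -3)")
    .saveAsTable("target_table"))

Each run rewrites the files for those months, so the previous files become unreferenced and pile up as history.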

What we have done:

SET spark.databricks.delta.retentionDurationCheck.enabled = false 
ALTER TABLE table_name SET TBLPROPERTIES ('delta.logRetentionDuration'='interval 48 HOURS', 'delta.deletedFileRetentionDuration'='interval 48 HOURS')
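To double-check that the properties were actually applied, we list them like this (just a sketch; table_name is the same placeholder as above):

# print only the Delta-related table properties
for row in spark.sql("SHOW TBLPROPERTIES table_name").collect():
    if row["key"].startswith("delta."):
        print(row["key"], "=", row["value"])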

We run the following commands on the Delta tables (full is the table name):

    # table maintenance: purge soft-deleted data, compact small files, then vacuum old files
    print("   OPTIMIZE, VACUUM")
    spark.sql("REORG TABLE {} APPLY (PURGE)".format(full))
    spark.sql("OPTIMIZE {}".format(full))
    spark.sql("VACUUM {} RETAIN 48 HOURS".format(full))

After this, my understanding is that only 48 hours of history should remain and the table size should shrink.
After running these commands, the history looks like this:

[screenshot: chrome_KZMxPl8x1d.png]

The history stays as it was and the table size is the same.
Can you give me some additional information on what I am doing wrong, or have I misunderstood the concept?

 

0 REPLIES
