Data Engineering

How to reduce storage space consumed by delta with many updates

Greg
New Contributor III

I have one Delta table that I continuously append events into, and a second Delta table that I continuously MERGE into (streamed from the first table). The second table holds unique IDs whose properties are updated from the events; each ID represents a unique thing that receives many events. The actual data size of the second table is ≈ 400 MB, but because of Delta versioning it consumes ≈ 40 GB. I have added a VACUUM every hour to the streaming process just to keep it this low. Any suggestions on how I can reduce storage consumption further? I do not need the versioning. Ideally there would be some way to disable it while retaining the ability to MERGE.
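
For reference, the hourly maintenance step described above might look roughly like the sketch below. It is only an illustration under stated assumptions: the table name `ids_table`, the two-hour retention window, and the helper names `vacuum_statements` and `run_maintenance` are hypothetical and not taken from the actual job. It relies on the Delta table property `delta.deletedFileRetentionDuration` and the `VACUUM ... RETAIN n HOURS` SQL command. Depending on the runtime, the safety check controlled by `spark.databricks.delta.retentionDurationCheck.enabled` may also need to be relaxed for short windows, and a short window can break readers or time travel that still depend on older files.

```python
def vacuum_statements(table, retain_hours):
    """Build the SQL that a periodic maintenance step could run.

    `table` and `retain_hours` are illustrative parameters, not values
    from the original job.
    """
    if retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    return [
        # Shorten how long files removed by MERGE rewrites are kept
        # before VACUUM is allowed to delete them.
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('delta.deletedFileRetentionDuration' = 'interval {retain_hours} hours')",
        # Physically delete unreferenced files older than the window.
        f"VACUUM {table} RETAIN {retain_hours} HOURS",
    ]


def run_maintenance(spark, table="ids_table", retain_hours=2):
    """Run the statements on an existing SparkSession (assumed to be
    configured for Delta Lake); call this once per hour from the job."""
    for statement in vacuum_statements(table, retain_hours):
        spark.sql(statement)
```

Keeping the statement builder separate from `run_maintenance` makes it possible to inspect exactly what would be executed before pointing it at a real table.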

1 REPLY

Jb11
New Contributor II

Did you already solve this problem?
