
How to reduce storage space consumed by delta with many updates

Greg
New Contributor III

I have one Delta table that I continuously append events into, and a second Delta table that I continuously merge into (streamed from the first table). The second table holds unique IDs whose properties are updated from the events (an ID represents a unique thing that receives many events). The actual data size of the second table is ≈ 400 MB, but because of Delta versions it consumes ≈ 40 GB. I have added a VACUUM every hour to the streaming process just to keep it this low. Any suggestions on how I can reduce this storage consumption further? I do not require the versioning. Ideally, I could disable it while retaining the ability to MERGE.
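
For readers with the same setup, here is a minimal sketch of the settings usually involved, not a confirmed fix for this thread. It only builds the Spark SQL statements as strings, so it runs without a cluster. The table name `events_by_id`, the helper name `build_retention_statements`, and the one-hour window are assumptions made for illustration. The table properties `delta.deletedFileRetentionDuration` and `delta.logRetentionDuration`, the `spark.databricks.delta.retentionDurationCheck.enabled` setting, and the `OPTIMIZE` and `VACUUM ... RETAIN` commands are standard Delta Lake features.

```python
from typing import List


def build_retention_statements(table: str, retain_hours: int = 1) -> List[str]:
    """Return Spark SQL statements that shorten Delta history retention for `table`.

    `table` and `retain_hours` are illustrative parameters, not values from the post.
    """
    if retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    return [
        # Allow VACUUM with a window shorter than the default 7-day safety check.
        "SET spark.databricks.delta.retentionDurationCheck.enabled = false",
        # Shorten how long removed data files and transaction log entries are kept.
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.deletedFileRetentionDuration' = 'interval {retain_hours} hours', "
        f"'delta.logRetentionDuration' = 'interval {retain_hours} hours')",
        # Compact the many small files that frequent MERGE operations produce.
        f"OPTIMIZE {table}",
        # Physically delete unreferenced files older than the retention window.
        f"VACUUM {table} RETAIN {retain_hours} HOURS",
    ]


if __name__ == "__main__":
    for stmt in build_retention_statements("events_by_id", retain_hours=1):
        print(stmt)  # on a cluster, each statement could be passed to spark.sql(stmt)
```

A short retention window gives up time travel beyond that window, which matches the stated requirement here. It also carries a risk: if the window is shorter than the longest-running concurrent read or write, VACUUM can delete files that job still needs, so the value should be chosen with the streaming job's batch duration in mind.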

1 REPLY

Jb11
New Contributor II

Did you already solve this problem?
