Hey @patilsuhasv ,
Yes, you absolutely can maintain 7 years of transactional data in a Delta table while having only 7 days of log retention. These are two separate concepts that work independently.
Understanding the Difference
Log Retention controls how long Delta Lake keeps the transaction log history (used for time travel and versioning capabilities). This doesn't delete your actual data.
Data Retention refers to how long you keep the actual data records in your table. Your current data remains in the table regardless of log retention settings.
Key Configuration
There are two main properties to understand:
delta.logRetentionDuration: Controls transaction log history (default 30 days)
- This affects your ability to time travel to previous versions
- Setting this to 7 days means you can only query versions from the last 7 days
delta.deletedFileRetentionDuration: Controls when VACUUM removes old data files (default 7 days)
- This only affects files that are no longer part of the current table (due to updates, deletes, or merges)
- Does NOT delete your current active data
Solution for Your Use Case
To keep 7 years of data with 7 days of log retention:
```sql
-- Set log retention to 7 days
ALTER TABLE your_table SET TBLPROPERTIES (
'delta.logRetentionDuration' = 'interval 7 days'
);
```
Your current transactional data will remain in the table indefinitely. The 7-day log retention simply means:
- You can only time travel back 7 days
- Transaction logs older than 7 days will be cleaned up automatically
Important Considerations
What Gets Kept:
- All current data in your table (regardless of age) stays
- Only historical versions beyond 7 days become inaccessible for time travel
- Active records from 7 years ago remain queryable as part of the current table state
What Gets Cleaned:
- Transaction logs older than 7 days
- Old data file versions (if you run VACUUM) that are no longer part of the current table
Best Practices:
- Don't run VACUUM frequently if you need longer time travel capabilities
- If you never update/delete records, your data files stay intact even with short log retention
- Consider your compliance and audit requirements before reducing log retention
- Storage costs are primarily driven by data volume, not log retention
Example Scenario
If you have a transaction table with records from 2018-2025:
- With 7 days log retention: All transactions from 2018-2025 remain queryable in the current table
- You just can't time travel to see what the table looked like 30 days ago
- The actual transaction records themselves are NOT deleted
The log retention setting is about versioning history, not data lifecycle management. Your 7-year-old transactions will remain accessible in your table's current state.
Hope this helps, Louis.