Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

DeltaFileNotFoundException: [DELTA_TRUNCATED_TRANSACTION_LOG] Error in Streaming Table with Minimal Updates

minhhung0507
Contributor

Dear Databricks Experts,

I am encountering a recurring issue while working with Delta streaming tables in my system. The error message is as follows:


com.databricks.sql.transaction.tahoe.DeltaFileNotFoundException: [DELTA_TRUNCATED_TRANSACTION_LOG] gs://cimb-prod-lakehouse/bronze-layer/icoredb/dpb_revi_loan/_delta_log/00000000000000000000.json: Unable to reconstruct state at version 899 as the transaction log has been truncated due to manual deletion or the log retention policy (delta.logRetentionDuration=3 days) and checkpoint retention policy (delta.checkpointRetentionDuration=2 days)
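
For reference, this is roughly how I list what is still present in that _delta_log directory from a Databricks notebook (a minimal sketch; it assumes the cluster can read the GCS path shown in the error, and dbutils is only available inside notebooks):

    # Sketch: list the remaining commit (.json) and checkpoint files in the
    # _delta_log directory referenced by the error above.
    log_path = "gs://cimb-prod-lakehouse/bronze-layer/icoredb/dpb_revi_loan/_delta_log/"

    for f in dbutils.fs.ls(log_path):
        if f.name.endswith(".json") or ".checkpoint" in f.name:
            print(f.name, f.size)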

Context:

  • I am designing a system that uses Delta format for streaming tables.
  • The affected tables have very few transactions or updates, which seems to make them prone to this error.
  • Upon inspecting the _delta_log directory, I noticed that only checkpoint versions 900 and 979 exist. However, the error indicates that it is trying to read from version 899.

Questions:

  1. Why is Databricks attempting to access version 899 when the checkpoint files available start from version 900? Could this be a bug or misconfiguration in Delta Lake's automatic cleanup process?
  2. Is it possible that Delta Lake's log and checkpoint retention policies are prematurely removing active checkpoints for tables with minimal updates? If so, how can I adjust these settings to prevent this issue?
  3. What are the recommended best practices for managing retention policies (delta.logRetentionDuration and delta.checkpointRetentionDuration) for Delta tables with infrequent updates?

Additional Information:

  • Retention settings (see the sketch after this list for how they can be inspected and adjusted):
    • delta.logRetentionDuration = "3 days"
    • delta.checkpointRetentionDuration = "2 days"
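
These are applied as Delta table properties. Below is a rough sketch (Python, from a notebook) of how they can be inspected and set, assuming the table is addressed by the GCS path from the error; it is an illustration rather than the exact commands used in our pipeline:

    # Sketch: inspect and (if needed) adjust the retention-related properties
    # on the affected Delta table, addressed here by its path.
    table = "delta.`gs://cimb-prod-lakehouse/bronze-layer/icoredb/dpb_revi_loan`"

    # The 'properties' column of DESCRIBE DETAIL shows the configured overrides
    spark.sql(f"DESCRIBE DETAIL {table}").select("properties").show(truncate=False)

    # Setting the values listed above (illustration only)
    spark.sql(f"""
        ALTER TABLE {table} SET TBLPROPERTIES (
            'delta.logRetentionDuration' = 'interval 3 days',
            'delta.checkpointRetentionDuration' = 'interval 2 days'
        )
    """)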

I would greatly appreciate any insights or suggestions on how to resolve this issue and prevent it from occurring in the future.

Thank you!

Hung Nguyen
4 REPLIES

minhhung0507
Contributor

Hi, does anyone have any suggestions for this topic?

Hung Nguyen

holly
Databricks Employee

Without knowing the read patterns, it's hard to say what the checkpointing issue is. But I'd recommend leaving the default retention periods for the log and checkpoint locations if your table isn't updated that often. I'd rarely recommend going lower than 7 days unless you have a very large, fast pipeline.
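
If you do go back to the defaults (or at least 7 days), something along these lines should do it. Just a sketch, assuming you address the table by its path as in your error:

    # Sketch: either drop the overrides so the table falls back to the Delta
    # defaults, or set explicit values of at least 7 days.
    table = "delta.`gs://cimb-prod-lakehouse/bronze-layer/icoredb/dpb_revi_loan`"

    # Option 1: remove the overrides and fall back to the defaults
    spark.sql(f"""
        ALTER TABLE {table} UNSET TBLPROPERTIES (
            'delta.logRetentionDuration',
            'delta.checkpointRetentionDuration'
        )
    """)

    # Option 2: keep explicit values, but no lower than 7 days
    spark.sql(f"""
        ALTER TABLE {table} SET TBLPROPERTIES (
            'delta.logRetentionDuration' = 'interval 30 days',
            'delta.checkpointRetentionDuration' = 'interval 7 days'
        )
    """)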

I've also never seen someone set checkpoint retention differently from log retention. Not saying it's wrong, just never seen it before. 

I'd also recommend looking into predictive optimisation - it's a great way to manage stale files without having to think about it much. 

minhhung0507
Contributor

The reason I had to set log retention and checkpoint retention to less than 7 days is that, with the default values, my pipeline gets a 'Listing file' error which we don't know how to fix yet. So the temporary solution is to reduce both values to less than 7 days.

Hung Nguyen

minhhung0507
Contributor

Hi @holly , 

Thanks for the suggestions and solutions you gave. I will try applying them and check the results.

Hung Nguyen
