
Transaction Log Failed Integrity Checks

Dave_Nithio
Contributor II

I have started to receive the following error message - that the transaction log has failed integrity checks - when attempting to optimize and run compaction on a table. It also occurs when I attempt to alter this table.

[Screenshot: error message stating that the transaction log has failed integrity checks]

This blocks my pipeline from running. What is strange is that I can run queries against the table without issue and all data is intact, but I cannot update the table. Other community messages have noted this error in the past, but the resolution involved simply updating the following spark setting and turning off the integrity check:

`spark.conf.set("spark.databricks.delta.state.corruptionIsFatal", False)`

I am concerned that simply setting corruptionIsFatal to False does not address the underlying problem down the line. The only method I have found that actually alleviates the issue is to copy the table to a new table and delete the original table. This keeps all of our data, but we lose the transaction history (basically resetting the transaction log to zero). I would prefer not to do that, though, so that we can still time travel. Does anyone have advice on what might be causing this issue or how it can be resolved?


mark_ott
Databricks Employee

Your issue, encountering "the transaction log has failed integrity checks" in Databricks Delta Lake, indicates metadata corruption or an inconsistency in the Delta transaction log (_delta_log). This typically blocks write operations such as OPTIMIZE, DELETE, INSERT, MERGE, and schema changes while read-only queries keep working, because the engine can tolerate many log problems during regular reads but not when writing data or updating metadata.

What Causes Transaction Log Integrity Checks to Fail?

  • Manual interference: Direct changes in the _delta_log directory (such as moving, renaming or deleting JSON/CRC files) outside of supported APIs.

  • Storage layer instability: Issues in the underlying cloud storage (S3/ADLS/Blob), such as eventual consistency or filesystem caching.

  • Process interruption: Terminated write/merge jobs that leave partial state.

  • Concurrent operations: Unusual concurrency patterns or forced interruptions.

  • Bugs/edge cases: Less common, but occasionally bugs in Delta Lake can leave corrupt states after job failures or crashes, especially with third-party Delta implementations.

Risks of Disabling the Integrity Check

Setting spark.databricks.delta.state.corruptionIsFatal to False simply suppresses the error and lets the system skip the check; it does not fix your log or the underlying corruption. While this can restore write access, you risk:

  • Data loss if underlying files are missing or inconsistent

  • Inability to audit or "time travel" reliably, as older table states may be gone or corrupt

  • Harder troubleshooting later

Safer Remediation Steps

1. Pinpoint Corruption

  • Use DESCRIBE HISTORY <table> and look for out-of-order, missing, or duplicate versions.

  • Inspect the _delta_log directory, looking for missing, truncated, or outright corrupted JSON/CRC files (see the sketch below).
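
A minimal diagnostic sketch, assuming a Databricks notebook where spark and dbutils are available; the table name and storage path are placeholders for your own:

```python
# Placeholder identifiers -- substitute your own table name and location.
table_name = "catalog.schema.my_table"
table_path = "abfss://container@account.dfs.core.windows.net/delta/my_table"

# Review the commit history for gaps, duplicates, or out-of-order versions.
history = spark.sql(f"DESCRIBE HISTORY {table_name}")
history.select("version", "timestamp", "operation").orderBy("version").show(50, truncate=False)

# List the transaction log files; look for missing, zero-length, or out-of-sequence JSON/CRC files.
for f in sorted(dbutils.fs.ls(f"{table_path}/_delta_log"), key=lambda x: x.name):
    print(f.name, f.size)
```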

2. Restore or Repair Transaction Log

If you have a backup, restoring an older (healthy) state via the Delta Lake RESTORE command or by copying _delta_log from backup can recover the timeline.

  • On Databricks, FSCK REPAIR TABLE can remove transaction log entries that point at files no longer present in storage; VACUUM only removes unreferenced data files and does not repair the log (RESTORE and FSCK are both shown in the sketch below).

  • Sometimes, Databricks Support can manually repair your _delta_log if you are an enterprise customer.
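
A hedged sketch of the repair/restore path, assuming the log is still readable up to a known-good version; the table name and version number are placeholders:

```python
# Placeholder table name -- substitute your own.
table_name = "catalog.schema.my_table"

# Dry run: report transaction log entries that point at files missing from storage,
# without removing anything yet.
spark.sql(f"FSCK REPAIR TABLE {table_name} DRY RUN").show(truncate=False)

# If DESCRIBE HISTORY shows a known-good version and the log is readable up to it,
# roll the table back to that version (412 is a placeholder version number).
spark.sql(f"RESTORE TABLE {table_name} TO VERSION AS OF 412")
```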

3. Export and Reload

If all else fails, copying the data out (just as you did) and recreating the table is the only way to guarantee integrity going forward, at the cost of the transaction history.
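
A minimal sketch of that path, using placeholder table names in the current schema; the rename step assumes ALTER TABLE ... RENAME TO is supported for your table type:

```python
# Copy the data into a brand-new Delta table. Its transaction log starts at version 0,
# so the old table's history and time travel are lost.
spark.sql("CREATE TABLE my_table_rebuilt USING DELTA AS SELECT * FROM my_table")

# Sanity-check row counts before switching over.
assert spark.table("my_table_rebuilt").count() == spark.table("my_table").count()

# Keep the corrupted table around under a different name until you are confident, then drop it.
spark.sql("ALTER TABLE my_table RENAME TO my_table_corrupt")
spark.sql("ALTER TABLE my_table_rebuilt RENAME TO my_table")
```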

4. Prevent Recurrence

  • Never modify files inside _delta_log directly.

  • Use only supported Databricks APIs for writes/merges.

  • Address any storage layer anomalies.

  • Use table access controls and monitoring.

Consult Databricks Support

If retaining your table history is business-critical, raise a support ticket with Databricks, referencing table path, workspace ID, error message, and a description of all recent operations. They have internal tools to repair many types of log corruption you cannot fix yourself.


In summary:
Do not rely on disabling corruptionIsFatal as a permanent solution—it hides symptoms, not causes. For enterprise/critical tables, escalate to Databricks Support for possible transaction log repair. For non-critical tables, or if support cannot help, copying/reloading will restore your table’s health but reset its log. Prevent further incidents by reviewing processes and ensuring only supported APIs and reliable storage layers are involved.


