Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Delta Lake Commit Versions: Are Gaps Possible?

Keremmm
New Contributor II

Hi everyone,

I'm exploring how commit versions work in Delta Lake and have a question regarding their sequencing. Specifically, I'm curious whether commit versions are guaranteed to be dense and sequential, or if there are scenarios where gaps might occur between version numbers. For example, could concurrent operations or rollbacks result in non-contiguous commit versions? Any insights or references to documentation on this behavior would be greatly appreciated.

Thanks in advance for your help!

1 REPLY

Vidhi_Khaitan
Databricks Employee

Delta Lake assigns commit versions as consecutive integers, so a healthy table's log should be dense and sequential, but the versions actually present in the log are not guaranteed to stay that way. Specifically, the DELTA_VERSIONS_NOT_CONTIGUOUS error condition indicates that versions are not contiguous, which can happen when files have been manually removed from the Delta log or due to eventual consistency issues in storage systems like S3. Concurrent writes on their own should not produce gaps: a transaction that loses a conflict is never committed to the log, so it does not consume a version number.

Please refer to this ->
https://docs.databricks.com/aws/en/error-messages/delta-versions-not-contiguous-error-class

Delta Lake checks for non-contiguous commit versions when it loads the log and raises this error rather than reading an incomplete history, which protects consistency across reads and writes.
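
If you want to check a table's log for gaps yourself, here is a minimal sketch. It relies on commit files in `_delta_log` being named with a 20-digit zero-padded version plus `.json`. The function names and the local-directory helper are hypothetical and not part of any Delta Lake API. On cloud storage you would list the objects with your storage client instead.

```python
import re
from pathlib import Path

# Commit files in _delta_log are named <20-digit zero-padded version>.json.
# Checkpoint, .crc, and other files do not match this pattern and are ignored.
COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def commit_versions(file_names):
    """Return the sorted commit versions found among the given file names."""
    versions = set()
    for name in file_names:
        match = COMMIT_FILE.match(name)
        if match:
            versions.add(int(match.group(1)))
    return sorted(versions)


def find_version_gaps(file_names):
    """Return the versions missing between the lowest and highest commit found.

    The scan starts at the lowest version present rather than at 0, because
    older commit files may already have been cleaned up.
    """
    versions = commit_versions(file_names)
    if not versions:
        return []
    present = set(versions)
    return [v for v in range(versions[0], versions[-1] + 1) if v not in present]


def find_gaps_in_log_dir(log_dir):
    """Hypothetical helper: scan a local copy of a _delta_log directory."""
    return find_version_gaps(p.name for p in Path(log_dir).iterdir())
```

For example, `find_gaps_in_log_dir("/path/to/table/_delta_log")` returns an empty list when every version between the oldest and newest commit file is present. A non-empty result is the kind of gap the error above reports.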