Data Engineering

Columns archive_time, commit_time, archive_time always NULL when running cloud_files_state

MRTN
New Contributor III

I am attempting to find the commit_time for a given file for a Delta table using the cloud_files_state command. However, the archive_time, commit_time, and archive_time columns are always NULL. I am running Databricks Runtime 11.3 and have also verified this with runtime version 13.0 ML.
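For reference, the kind of query in question can be sketched as below. This is only a minimal illustration: the helper name `cloud_files_state_query` and the checkpoint path `/mnt/checkpoints/my_stream` are hypothetical, and the sketch only builds the SQL text so it can be read without a cluster.

```python
def cloud_files_state_query(checkpoint_path: str) -> str:
    """Build a SQL query over the Auto Loader file state stored in a stream checkpoint."""
    # Refuse paths containing a single quote rather than guess at SQL escaping rules.
    if "'" in checkpoint_path:
        raise ValueError("checkpoint path must not contain a single quote")
    return (
        "SELECT path, commit_time, archive_time "
        f"FROM cloud_files_state('{checkpoint_path}')"
    )


# Hypothetical checkpoint location; replace it with the stream's own checkpoint path.
query = cloud_files_state_query("/mnt/checkpoints/my_stream")
print(query)
```

On a Databricks cluster the resulting string could be passed to `spark.sql(query)` to display the columns discussed here.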

cloud_files_state 

The issue has also been addressed in the following post: https://community.databricks.com/s/question/0D58Y00009gd0TDSAY/auto-loader-empty-fields-discoverytim...

Is this a bug? Is any fix available?

1 REPLY

Anonymous
Not applicable

@Morten Stakkeland:

The issue you are facing with the cloud_files_state command is a known limitation in Delta Lake as of the latest stable release (Delta Lake 1.0). The commit_time and protocol columns are always null, and the archive_time column is also null for most files. This is because Delta Lake does not track commit_time and protocol for files written through the cloud storage API, and archive_time is only set when the file is actively being managed by Delta Lake's retention mechanism.

There is a feature request to address this limitation and provide more accurate commit_time and protocol information for files written through cloud storage APIs, but it is not currently implemented. You can track the status of this feature request in the Delta Lake GitHub repository. As for archive_time, if you need to track it for a specific file, you can inspect the commit history in the table's Delta transaction log (the _delta_log directory) and find the commit that created or deleted the file. From there, you can use the versionAsOf read option to read the table as it existed at that commit and inspect the archive_time column.
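The commit-history lookup described above could be sketched roughly as follows. It assumes the commit files from _delta_log have already been read into memory as (version, text) pairs, with one JSON action per line. The helper name `find_add_commit` and the sample commits are made up for illustration. Note that the paths in `add` actions refer to data files inside the table, which are not necessarily the source files that Auto Loader ingested.

```python
import json
from typing import Iterable, Optional, Tuple


def find_add_commit(commits: Iterable[Tuple[int, str]], file_path: str) -> Optional[dict]:
    """Return the version and timestamp of the first commit that adds file_path."""
    for version, text in sorted(commits, key=lambda c: c[0]):
        actions = [json.loads(line) for line in text.splitlines() if line.strip()]
        # commitInfo, when present, carries the commit timestamp in epoch milliseconds.
        timestamp = next(
            (a["commitInfo"].get("timestamp") for a in actions if "commitInfo" in a),
            None,
        )
        for action in actions:
            if action.get("add", {}).get("path") == file_path:
                return {"version": version, "timestamp_ms": timestamp}
    return None


# Made-up commit contents standing in for two files from a table's _delta_log directory.
sample = [
    (0, '{"commitInfo":{"timestamp":1680000000000}}\n'
        '{"add":{"path":"part-0000.parquet","size":10,"dataChange":true}}'),
    (1, '{"commitInfo":{"timestamp":1680000600000}}\n'
        '{"add":{"path":"part-0001.parquet","size":12,"dataChange":true}}'),
]

result = find_add_commit(sample, "part-0001.parquet")
print(result)  # {'version': 1, 'timestamp_ms': 1680000600000}
```

With the version in hand, the table as of that commit could then be read on a cluster with `spark.read.format("delta").option("versionAsOf", result["version"]).load(table_path)`, where `table_path` is the table's location.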
