<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Columns archive_time, commit_time, archive_time always NULL when running cloud_files_state in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/columns-archive-time-commit-time-archive-time-always-null-when/m-p/5137#M1641</link>
    <description>&lt;P&gt;I am attempting to find the commit_time for a given file for a Delta table using the cloud_files_state command. However, the commit_time and archive_time columns are always NULL. I am running Databricks Runtime 11.3 and have also verified this with runtime version 13.0 ML.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="cloud_files_state"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/261iECF0EE3360AC15CF/image-size/large?v=v2&amp;amp;px=999" role="button" title="cloud_files_state" alt="cloud_files_state" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The issue has also been addressed in the following post: &lt;A href="https://community.databricks.com/s/question/0D58Y00009gd0TDSAY/auto-loader-empty-fields-discoverytime-committime-archivetime-in-cloudfilesstate" alt="https://community.databricks.com/s/question/0D58Y00009gd0TDSAY/auto-loader-empty-fields-discoverytime-committime-archivetime-in-cloudfilesstate" target="_blank"&gt;https://community.databricks.com/s/question/0D58Y00009gd0TDSAY/auto-loader-empty-fields-discoverytime-committime-archivetime-in-cloudfilesstate&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Is this a bug? Is any fix available?&lt;/P&gt;</description>
    <pubDate>Thu, 27 Apr 2023 07:31:46 GMT</pubDate>
    <dc:creator>MRTN</dc:creator>
    <dc:date>2023-04-27T07:31:46Z</dc:date>
    <item>
      <title>Columns archive_time, commit_time, archive_time always NULL when running cloud_files_state</title>
      <link>https://community.databricks.com/t5/data-engineering/columns-archive-time-commit-time-archive-time-always-null-when/m-p/5137#M1641</link>
      <description>&lt;P&gt;I am attempting to find the commit_time for a given file for a Delta table using the cloud_files_state command. However, the commit_time and archive_time columns are always NULL. I am running Databricks Runtime 11.3 and have also verified this with runtime version 13.0 ML.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="cloud_files_state"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/261iECF0EE3360AC15CF/image-size/large?v=v2&amp;amp;px=999" role="button" title="cloud_files_state" alt="cloud_files_state" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The issue has also been addressed in the following post: &lt;A href="https://community.databricks.com/s/question/0D58Y00009gd0TDSAY/auto-loader-empty-fields-discoverytime-committime-archivetime-in-cloudfilesstate" alt="https://community.databricks.com/s/question/0D58Y00009gd0TDSAY/auto-loader-empty-fields-discoverytime-committime-archivetime-in-cloudfilesstate" target="_blank"&gt;https://community.databricks.com/s/question/0D58Y00009gd0TDSAY/auto-loader-empty-fields-discoverytime-committime-archivetime-in-cloudfilesstate&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Is this a bug? Is any fix available?&lt;/P&gt;</description>
      <pubDate>Thu, 27 Apr 2023 07:31:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/columns-archive-time-commit-time-archive-time-always-null-when/m-p/5137#M1641</guid>
      <dc:creator>MRTN</dc:creator>
      <dc:date>2023-04-27T07:31:46Z</dc:date>
    </item>
    <item>
      <title>Re: Columns archive_time, commit_time, archive_time always NULL when running cloud_files_state</title>
      <link>https://community.databricks.com/t5/data-engineering/columns-archive-time-commit-time-archive-time-always-null-when/m-p/5138#M1642</link>
      <description>&lt;P&gt;@Morten Stakkeland:&lt;/P&gt;&lt;P&gt;What you are seeing with cloud_files_state appears to be a known limitation rather than a problem with your setup. Note that cloud_files_state is an Auto Loader function that reports the file-level state held in a stream's checkpoint; it is not a Delta Lake feature, so the Delta Lake version is not the deciding factor. As I understand it, commit_time stays NULL until Auto Loader marks a file as committed in the checkpoint, which can happen well after the file has been processed, and archive_time is only set once a file has actually been archived, so it is NULL for most files.&lt;/P&gt;&lt;P&gt;I am not aware of a fix that makes these columns more reliable at the moment. If what you need is the time at which a file's data landed in the Delta table, you can inspect the table's commit history (for example with DESCRIBE HISTORY) to find the commit that added the data. From there, you can use the versionAsOf option to read the table as it existed at that commit.&lt;/P&gt;</description>
      <pubDate>Fri, 28 Apr 2023 17:57:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/columns-archive-time-commit-time-archive-time-always-null-when/m-p/5138#M1642</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-04-28T17:57:11Z</dc:date>
    </item>
  </channel>
</rss>

