<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: commit time is coming as null in autoloader in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/123675#M47052</link>
    <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/170700"&gt;@shrutikatyal&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;I believe the only current route to a discount voucher is the following:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.databricks.com/t5/events/dais-2025-virtual-learning-festival-11-june-02-july-2025/ev-p/119323" target="_blank"&gt;https://community.databricks.com/t5/events/dais-2025-virtual-learning-festival-11-june-02-july-2025/ev-p/119323&lt;/A&gt;&lt;BR /&gt;I think it’s the last day of the event, so you might need to be quick!&lt;BR /&gt;Hope this helps,&lt;BR /&gt;TheOC&lt;/P&gt;</description>
    <pubDate>Wed, 02 Jul 2025 12:46:18 GMT</pubDate>
    <dc:creator>TheOC</dc:creator>
    <dc:date>2025-07-02T12:46:18Z</dc:date>
    <item>
      <title>commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122213#M46699</link>
      <description>&lt;P&gt;Auto Loader has a new archive-and-move feature, and I am trying to use it on Databricks Runtime 16.4.x-scala2.12. However, the commit time is still coming back as null, and the documentation says the feature won't work while the commit time is null. How can I resolve this?&lt;/P&gt;&lt;P&gt;I am using the Auto Loader configuration below:&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;df = spark.readStream.format("cloudFiles")\&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .option("cloudFiles.format", "json")\&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .option("cloudFiles.schemaLocation", checkpoint_path)\&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .option("multiLine", "true")\&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .option("cloudFiles.backfillInterval", "10 minutes")\&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .option("cloudFiles.inferColumnTypes", "true")\&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .load(ingestDirectory)&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;dfOutput = (&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; df.writeStream&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; .trigger(once=True)&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; .format("delta")&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; .option("mergeSchema", "true")&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; .option("checkpointLocation", checkpoint_path)&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; .start(CuratedDirectory)&lt;/DIV&gt;&lt;DIV&gt;)&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;dfOutput.awaitTermination()&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 19 Jun 2025 07:09:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122213#M46699</guid>
      <dc:creator>shrutikatyal</dc:creator>
      <dc:date>2025-06-19T07:09:40Z</dc:date>
    </item>
    <item>
      <title>Re: commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122278#M46728</link>
      <description>&lt;P&gt;You're right: the new **archival and move feature in Auto Loader** depends on the `_commit_timestamp` column. If that value is coming as `null`, the feature won't work, as mentioned in the documentation.&lt;/P&gt;&lt;P&gt;To fix this, you need to make sure you're explicitly enabling the `commitTime` metadata column using the following option in your Auto Loader configuration:&lt;/P&gt;&lt;P&gt;```python&lt;BR /&gt;.option("cloudFiles.addColumns", "commitTime")&lt;BR /&gt;```&lt;/P&gt;&lt;P&gt;This ensures that the `_commit_timestamp` field gets populated during ingestion.&lt;/P&gt;&lt;P&gt;Here’s the corrected version of your code:&lt;/P&gt;&lt;P&gt;```python&lt;BR /&gt;df = (&lt;BR /&gt;spark.readStream.format("cloudfiles")&lt;BR /&gt;.option("cloudFiles.format", "json")&lt;BR /&gt;.option("cloudFiles.schemaLocation", checkpoint_path)&lt;BR /&gt;.option("multiLine", "true")&lt;BR /&gt;.option("cloudFiles.backfillInterval", "10 minutes")&lt;BR /&gt;.option("cloudFiles.inferColumnTypes", "true")&lt;BR /&gt;.option("cloudFiles.addColumns", "commitTime") # Required to get _commit_timestamp&lt;BR /&gt;.load(ingestDirectory)&lt;BR /&gt;)&lt;BR /&gt;```&lt;/P&gt;&lt;P&gt;Once this is set, `_commit_timestamp` should be populated properly, and the archival/move feature should start working as expected.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2025 17:10:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122278#M46728</guid>
      <dc:creator>Yogesh_Verma_</dc:creator>
      <dc:date>2025-06-19T17:10:32Z</dc:date>
    </item>
    <item>
      <title>Re: commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122418#M46767</link>
      <description>&lt;P&gt;Hi Yogesh,&lt;BR /&gt;I get the error below when I add this option to my Auto Loader configuration: &lt;SPAN&gt;.option("cloudFiles.addColumns", "commitTime")&lt;/SPAN&gt;.&lt;/P&gt;</description>
      <pubDate>Sat, 21 Jun 2025 07:48:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122418#M46767</guid>
      <dc:creator>shrutikatyal</dc:creator>
      <dc:date>2025-06-21T07:48:04Z</dc:date>
    </item>
    <item>
      <title>Re: commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122419#M46768</link>
      <description>&lt;P&gt;I have attached all the screenshots for reference.&lt;/P&gt;</description>
      <pubDate>Sat, 21 Jun 2025 07:52:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122419#M46768</guid>
      <dc:creator>shrutikatyal</dc:creator>
      <dc:date>2025-06-21T07:52:04Z</dc:date>
    </item>
    <item>
      <title>Re: commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122420#M46769</link>
      <description>&lt;P&gt;I have attached a screenshot of the cluster configuration below.&lt;/P&gt;</description>
      <pubDate>Sat, 21 Jun 2025 07:54:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122420#M46769</guid>
      <dc:creator>shrutikatyal</dc:creator>
      <dc:date>2025-06-21T07:54:11Z</dc:date>
    </item>
    <item>
      <title>Re: commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122451#M46776</link>
      <description>&lt;P class="p1"&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/170700"&gt;@shrutikatyal&lt;/a&gt;&amp;nbsp;I believe commit_time is only populated when the &lt;STRONG&gt;cloudFiles.cleanSource&lt;/STRONG&gt; option is enabled, and I don't see that option in your snippet. Could you please enable it on the read and check again?&lt;/P&gt;
&lt;P class="p1"&gt;The documentation below specifies that the commit_time column is supported in Databricks Runtime 16.4 and above when cloudFiles.cleanSource is enabled:&lt;/P&gt;
&lt;P class="p1"&gt;&lt;A href="https://docs.databricks.com/aws/en/sql/language-manual/functions/cloud_files_state" target="_blank"&gt;https://docs.databricks.com/aws/en/sql/language-manual/functions/cloud_files_state&lt;/A&gt;&lt;/P&gt;
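&lt;P class="p1"&gt;As a minimal sketch of what enabling this could look like: the helper below builds the reader options as a plain dictionary. The name clean_source_options and the archive path are hypothetical, chosen only for illustration. The sketch assumes the documented MOVE and DELETE values for cloudFiles.cleanSource, with cloudFiles.cleanSource.moveDestination supplied for MOVE.&lt;/P&gt;
&lt;PRE&gt;

```python
def clean_source_options(mode, move_destination=None):
    """Build Auto Loader reader options that turn on cloudFiles.cleanSource.

    Hypothetical helper for illustration; not part of any Databricks API.
    """
    mode = mode.upper()
    if mode not in ("MOVE", "DELETE"):
        raise ValueError("mode must be MOVE or DELETE to enable cleanSource")
    options = {
        "cloudFiles.format": "json",
        "cloudFiles.cleanSource": mode,
    }
    if mode == "MOVE":
        # MOVE needs a destination for the archived files.
        if not move_destination:
            raise ValueError("MOVE requires a move destination")
        options["cloudFiles.cleanSource.moveDestination"] = move_destination
    return options


# "/mnt/archive/landing" is an assumed placeholder path.
opts = clean_source_options("move", "/mnt/archive/landing")
print(opts)

# On Databricks (not runnable locally), roughly:
# df = spark.readStream.format("cloudFiles").options(**opts).load(ingestDirectory)
```

&lt;/PRE&gt;
&lt;P class="p1"&gt;On a cluster, these would be passed with .options(**opts) before .load(ingestDirectory), alongside the schema location and other options from the original snippet. After a later run of the stream, querying cloud_files_state for the checkpoint should show whether commit_time is populated.&lt;/P&gt;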
&lt;P class="p2"&gt;&lt;SPAN&gt;Note that a file might be processed but marked as committed arbitrarily later. commit_time is usually updated at the start of the next micro-batch.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 22 Jun 2025 04:49:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122451#M46776</guid>
      <dc:creator>mani_22</dc:creator>
      <dc:date>2025-06-22T04:49:10Z</dc:date>
    </item>
    <item>
      <title>Re: commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122491#M46793</link>
      <description>&lt;P&gt;Thanks, it's working now.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jun 2025 06:25:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/122491#M46793</guid>
      <dc:creator>shrutikatyal</dc:creator>
      <dc:date>2025-06-23T06:25:18Z</dc:date>
    </item>
    <item>
      <title>Re: commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/123670#M47051</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;I am interested in the Databricks Certified Data Engineer Associate certification. Can I get a voucher?&lt;/P&gt;</description>
      <pubDate>Wed, 02 Jul 2025 12:31:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/123670#M47051</guid>
      <dc:creator>shrutikatyal</dc:creator>
      <dc:date>2025-07-02T12:31:24Z</dc:date>
    </item>
    <item>
      <title>Re: commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/123675#M47052</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/170700"&gt;@shrutikatyal&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;I believe the only current route to a discount voucher is the following:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.databricks.com/t5/events/dais-2025-virtual-learning-festival-11-june-02-july-2025/ev-p/119323" target="_blank"&gt;https://community.databricks.com/t5/events/dais-2025-virtual-learning-festival-11-june-02-july-2025/ev-p/119323&lt;/A&gt;&lt;BR /&gt;I think it’s the last day of the event, so you might need to be quick!&lt;BR /&gt;Hope this helps,&lt;BR /&gt;TheOC&lt;/P&gt;</description>
      <pubDate>Wed, 02 Jul 2025 12:46:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/123675#M47052</guid>
      <dc:creator>TheOC</dc:creator>
      <dc:date>2025-07-02T12:46:18Z</dc:date>
    </item>
    <item>
      <title>Re: commit time is coming as null in autoloader</title>
      <link>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/125084#M47335</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;I completed the certification through the Databricks Learning Festival, but I haven't received a voucher yet.&lt;BR /&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Mon, 14 Jul 2025 03:54:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/commit-time-is-coming-as-null-in-autoloader/m-p/125084#M47335</guid>
      <dc:creator>shrutikatyal</dc:creator>
      <dc:date>2025-07-14T03:54:50Z</dc:date>
    </item>
  </channel>
</rss>

