<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Problems with cluster shutdown in DLT in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131792#M49241</link>
    <description>&lt;P&gt;&lt;STRONG&gt;[Issue] DLT finishes processing, but cluster remains active due to log write error&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Hi everyone, I'm running into a problem with my DLT pipeline and was hoping someone here could help or has experienced something similar.&lt;/P&gt;&lt;H3&gt;Problem Description&lt;/H3&gt;&lt;P&gt;The pipeline completes data processing successfully, but the &lt;STRONG&gt;cluster stays active for a long time&lt;/STRONG&gt;, even though no data is being processed anymore.&lt;/P&gt;&lt;P&gt;After checking the &lt;STRONG&gt;Driver Logs&lt;/STRONG&gt;, I noticed that the system keeps trying to write &lt;STRONG&gt;execution logs and cluster information&lt;/STRONG&gt; but hits an error each time. As a result, it retries every minute and gets stuck in this loop.&lt;/P&gt;&lt;H3&gt;Error Snippet&lt;/H3&gt;&lt;P&gt;25/09/12 11:13:57 ERROR NativeADLGen2RequestComparisonHandler: Error in request comparison&lt;BR /&gt;java.lang.NumberFormatException: For input string: "Fri, 12 Sep 2025 11:13:58 GMT"&lt;BR /&gt;at java.base/java.lang.Long.parseLong(Long.java:711)&lt;BR /&gt;...&lt;BR /&gt;at com.databricks.sql.io.NativeADLGen2RequestComparisonHandler.doHandle(NativeADLGen2RequestComparisonHandler.scala:94)&lt;/P&gt;&lt;P&gt;It seems that when DLT tries to &lt;STRONG&gt;write to its own event log&lt;/STRONG&gt;, it first attempts to &lt;STRONG&gt;read the current log state&lt;/STRONG&gt; (e.g., Loading version 306944). The bug appears during this &lt;STRONG&gt;read operation&lt;/STRONG&gt;, where it throws a NumberFormatException when parsing a timestamp.&lt;/P&gt;&lt;H3&gt;Observations&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;The error &lt;STRONG&gt;does not crash the pipeline&lt;/STRONG&gt;, but it seems to &lt;STRONG&gt;trigger a retry mechanism&lt;/STRONG&gt;.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;This leads to a loop: read → fail → wait → retry, keeping the cluster alive unnecessarily.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;Question&lt;/H3&gt;&lt;P&gt;Has anyone else faced this issue? Any idea how to work around it or resolve it?&lt;/P&gt;&lt;P&gt;Thanks in advance!&lt;/P&gt;</description>
    <pubDate>Fri, 12 Sep 2025 14:29:15 GMT</pubDate>
    <dc:creator>LucasAntoniolli</dc:creator>
    <dc:date>2025-09-12T14:29:15Z</dc:date>
    <item>
      <title>Problems with cluster shutdown in DLT</title>
      <link>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131792#M49241</link>
      <description>&lt;P&gt;&lt;STRONG&gt;[Issue] DLT finishes processing, but cluster remains active due to log write error&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Hi everyone, I'm running into a problem with my DLT pipeline and was hoping someone here could help or has experienced something similar.&lt;/P&gt;&lt;H3&gt;Problem Description&lt;/H3&gt;&lt;P&gt;The pipeline completes data processing successfully, but the &lt;STRONG&gt;cluster stays active for a long time&lt;/STRONG&gt;, even though no data is being processed anymore.&lt;/P&gt;&lt;P&gt;After checking the &lt;STRONG&gt;Driver Logs&lt;/STRONG&gt;, I noticed that the system keeps trying to write &lt;STRONG&gt;execution logs and cluster information&lt;/STRONG&gt; but hits an error each time. As a result, it retries every minute and gets stuck in this loop.&lt;/P&gt;&lt;H3&gt;Error Snippet&lt;/H3&gt;&lt;P&gt;25/09/12 11:13:57 ERROR NativeADLGen2RequestComparisonHandler: Error in request comparison&lt;BR /&gt;java.lang.NumberFormatException: For input string: "Fri, 12 Sep 2025 11:13:58 GMT"&lt;BR /&gt;at java.base/java.lang.Long.parseLong(Long.java:711)&lt;BR /&gt;...&lt;BR /&gt;at com.databricks.sql.io.NativeADLGen2RequestComparisonHandler.doHandle(NativeADLGen2RequestComparisonHandler.scala:94)&lt;/P&gt;&lt;P&gt;It seems that when DLT tries to &lt;STRONG&gt;write to its own event log&lt;/STRONG&gt;, it first attempts to &lt;STRONG&gt;read the current log state&lt;/STRONG&gt; (e.g., Loading version 306944). The bug appears during this &lt;STRONG&gt;read operation&lt;/STRONG&gt;, where it throws a NumberFormatException when parsing a timestamp.&lt;/P&gt;&lt;H3&gt;Observations&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;The error &lt;STRONG&gt;does not crash the pipeline&lt;/STRONG&gt;, but it seems to &lt;STRONG&gt;trigger a retry mechanism&lt;/STRONG&gt;.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;This leads to a loop: read → fail → wait → retry, keeping the cluster alive unnecessarily.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;Question&lt;/H3&gt;&lt;P&gt;Has anyone else faced this issue? Any idea how to work around it or resolve it?&lt;/P&gt;&lt;P&gt;Thanks in advance!&lt;/P&gt;</description>
      <pubDate>Fri, 12 Sep 2025 14:29:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131792#M49241</guid>
      <dc:creator>LucasAntoniolli</dc:creator>
      <dc:date>2025-09-12T14:29:15Z</dc:date>
    </item>
    <item>
      <title>Re: Problems with cluster shutdown in DLT</title>
      <link>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131801#M49242</link>
      <description>&lt;P&gt;Here are a couple of quick workarounds you can try:&lt;/P&gt;&lt;P&gt;1.&amp;nbsp;&lt;SPAN&gt;Development mode keeps the cluster warm for rapid iteration, while production mode stops the cluster right after the run finishes. If you must stay in development mode, tune&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;pipelines.clusterShutdown.delay&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;so the cluster doesn’t linger; otherwise, switch the pipeline to production mode for the cost savings.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;2.&amp;nbsp;In the driver logs you’ll see the&amp;nbsp;NumberFormatException&amp;nbsp;repeating roughly every minute even after the pipeline reports “completed”; that’s the smoking gun. If you’re on a recent DBR (e.g., 15.x/16.x), try pinning the pipeline to&amp;nbsp;&lt;STRONG&gt;DBR 14.3 LTS&lt;/STRONG&gt;&amp;nbsp;or, conversely, to the&amp;nbsp;&lt;STRONG&gt;latest LTS&lt;/STRONG&gt;&amp;nbsp;to see if the ADLS client code path differs.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 12 Sep 2025 16:10:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131801#M49242</guid>
      <dc:creator>nayan_wylde</dc:creator>
      <dc:date>2025-09-12T16:10:50Z</dc:date>
    </item>
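The development/production and shutdown-delay advice in the reply above can be sketched as pipeline settings. A minimal illustration, not taken from the thread: the pipeline name and the 60-second delay are hypothetical values, and `pipelines.clusterShutdown.delay` is the configuration key the reply names.

```python
# Sketch of DLT pipeline settings illustrating the reply above.
# The pipeline name and the delay value are hypothetical examples.
settings = {
    "name": "my_dlt_pipeline",   # hypothetical pipeline name
    "development": False,        # production mode: stop the cluster right after the run
    "configuration": {
        # In development mode the cluster is kept warm for iteration; this
        # duration string caps how long it lingers after the update finishes.
        "pipelines.clusterShutdown.delay": "60s",
    },
}

def shutdown_delay_seconds(cfg: dict) -> int:
    """Parse the delay back out, assuming an 's'-suffixed duration string."""
    raw = cfg["configuration"]["pipelines.clusterShutdown.delay"]
    if not raw.endswith("s"):
        raise ValueError(f"expected a seconds suffix, got {raw!r}")
    return int(raw[:-1])
```

In development mode this delay is what keeps the cluster alive between runs; in production mode (`"development": False`) the cluster is stopped as soon as the update completes.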
    <item>
      <title>Re: Problems with cluster shutdown in DLT</title>
      <link>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131803#M49243</link>
      <description>&lt;P&gt;I tested going back to DBR 15.4 LTS, where the problem did not occur (the current version is 16.4), but the pipeline won't accept pinning an earlier LTS. I tried forcing it through cluster policies, but it automatically pulls the latest version. In the pipeline's channel option there are only two choices, Current and Preview, so the LTS I set in the policy is ignored. I also tried putting the LEGACY runtime in the JSON, but DLT no longer accepts that LEGACY parameter.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Sep 2025 16:31:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131803#M49243</guid>
      <dc:creator>LucasAntoniolli</dc:creator>
      <dc:date>2025-09-12T16:31:19Z</dc:date>
    </item>
    <item>
      <title>Re: Problems with cluster shutdown in DLT</title>
      <link>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131815#M49247</link>
      <description>&lt;P&gt;Can you please try one more option?&amp;nbsp;&lt;SPAN&gt;If you’re on&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Preview&lt;/STRONG&gt;&lt;SPAN&gt;, move to&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Current&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;(or vice versa); sometimes a regression exists in only one channel.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 12 Sep 2025 18:37:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131815#M49247</guid>
      <dc:creator>nayan_wylde</dc:creator>
      <dc:date>2025-09-12T18:37:45Z</dc:date>
    </item>
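The channel switch suggested above can also be done outside the UI by updating the pipeline settings. A hedged sketch, assuming the Databricks Pipelines REST API shape (a settings update to `/api/2.0/pipelines/{pipeline_id}` with a top-level `channel` field of `CURRENT` or `PREVIEW`); the pipeline ID below is a placeholder, and the body is trimmed to the relevant fields.

```python
# Sketch: build the settings body that flips a DLT pipeline between
# release channels. The pipeline ID is a placeholder, not from the thread.
VALID_CHANNELS = ("CURRENT", "PREVIEW")

def channel_update_payload(pipeline_id: str, channel: str) -> dict:
    """Return a partial settings body selecting a DLT release channel."""
    if channel not in VALID_CHANNELS:
        raise ValueError(f"unknown DLT channel: {channel!r}")
    return {"id": pipeline_id, "channel": channel}

# Example: a pipeline currently on Preview would be moved to Current.
payload = channel_update_payload("0123-example-pipeline-id", "CURRENT")
```

Sending the opposite value moves a Current pipeline onto Preview, which is how you would test whether the regression exists in only one channel.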
    <item>
      <title>Re: Problems with cluster shutdown in DLT</title>
      <link>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131866#M49275</link>
      <description>&lt;P&gt;You won't believe it, my friend. I had already tried everything yesterday with no luck and couldn't find the problem, until I read your answer mentioning production and development. That made me toggle the DLT into development mode and then immediately back to production. To my surprise, the cluster stopped failing to shut down. The funny thing is that the pipeline was already in production, and only this DLT had the problem; all the others were working normally. I honestly don't know what happened, but it's solved. Thank you very much for your help and answers.&lt;/P&gt;</description>
      <pubDate>Sat, 13 Sep 2025 13:50:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/problems-with-cluster-shutdown-in-dlt/m-p/131866#M49275</guid>
      <dc:creator>LucasAntoniolli</dc:creator>
      <dc:date>2025-09-13T13:50:04Z</dc:date>
    </item>
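The fix that worked here (toggling the pipeline into development mode and straight back to production) amounts to two successive settings updates. A hypothetical sketch, assuming a pipeline-settings shape with a top-level `development` flag; the pipeline ID is a placeholder and each dict is trimmed to the one relevant field.

```python
# Sketch of the UI toggle described above as two settings updates.
def dev_prod_toggle(pipeline_id: str) -> list[dict]:
    """Flip a pipeline into development mode, then straight back to
    production. Returns the two update bodies in order."""
    return [
        {"id": pipeline_id, "development": True},   # step 1: into development
        {"id": pipeline_id, "development": False},  # step 2: back to production
    ]

steps = dev_prod_toggle("0123-example-pipeline-id")
```

Why this unstuck the cluster is unclear from the thread; plausibly the mode change forced the pipeline's cluster to be recycled, breaking the retry loop.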
  </channel>
</rss>