<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Spark streaming failing intermittently with IllegalStateException: Found no SST files in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126861#M47786</link>
    <description>&lt;P&gt;&lt;STRONG&gt;More context:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;This error mostly occurs when we restart the job run, and it affects 1 or 2 datasets at random. We restart the job via the Databricks API.&lt;/P&gt;</description>
    <pubDate>Tue, 29 Jul 2025 20:37:59 GMT</pubDate>
    <dc:creator>susmitsircar</dc:creator>
    <dc:date>2025-07-29T20:37:59Z</dc:date>
    <item>
      <title>Spark streaming failing intermittently with IllegalStateException: Found no SST files</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126705#M47744</link>
      <description>&lt;P&gt;I'm encountering the following error while trying to upload a RocksDB checkpoint in Databricks:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;java.lang.IllegalStateException: Found no SST files during uploading RocksDB checkpoint version 498 with 2332 key(s).
    at com.databricks.sql.streaming.state.RocksDBFileManager.verifyImmutableFiles(RocksDBFileManager.scala:620)
    at com.databricks.sql.streaming.state.RocksDBFileManager.saveCheckpointToDbfs(RocksDBFileManager.scala:173)
    at com.databricks.sql.rocksdb.CloudRocksDB.$anonfun$sync$7(CloudRocksDB.scala:235)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:668)
    at com.databricks.sql.rocksdb.CloudRocksDB.timeTakenMs(CloudRocksDB.scala:634)
    at com.databricks.sql.rocksdb.CloudRocksDB.$anonfun$sync$1(CloudRocksDB.scala:234)
    at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
    at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:395)&lt;/LI-CODE&gt;&lt;P&gt;Background:&lt;/P&gt;&lt;P&gt;I am using Databricks with Spark and RocksDB for stateful streaming. This error occurs when Spark attempts to upload a RocksDB checkpoint, and the system reports that no SST files were found.&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;What could be causing this error, and why are no SST files found during the upload?&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Are there any specific configurations or setup steps I might be missing for handling RocksDB checkpoints properly in Databricks?&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;What potential solutions or workarounds exist for this issue?&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;We are using all default Spark runtime configurations.&lt;/P&gt;&lt;P&gt;Spark version: 2.x&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Current workaround:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Deleting the checkpoint from S3 and retriggering the streaming pipeline fixes the issue.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jul 2025 13:45:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126705#M47744</guid>
      <dc:creator>susmitsircar</dc:creator>
      <dc:date>2025-07-28T13:45:31Z</dc:date>
    </item>
    <item>
      <title>Re: Spark streaming failing intermittently with IllegalStateException: Found no SST files</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126741#M47759</link>
      <description>&lt;P class="p1"&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/176996"&gt;@susmitsircar&lt;/a&gt;&amp;nbsp;It’s possible that the keys in the partition are too old (more than 7 days) and have expired via TTL, which leaves no SST files. RocksDB runs a background compaction to clean up stale data. Which DBR version are you using?&lt;/P&gt;
&lt;P class="p1"&gt;Some heuristic checks are performed on the RocksDB files before they are uploaded. You can set the configs below to turn off these checks:&lt;/P&gt;
&lt;P class="p1"&gt;spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows false&lt;BR /&gt;spark.databricks.rocksDB.verifyBeforeUpload false&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jul 2025 20:23:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126741#M47759</guid>
      <dc:creator>mani_22</dc:creator>
      <dc:date>2025-07-28T20:23:34Z</dc:date>
    </item>
    <item>
      <title>Re: Spark streaming failing intermittently with IllegalStateException: Found no SST files</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126771#M47768</link>
      <description>&lt;P&gt;Thanks for your reply. That's a great catch, and now I can connect the dots.&lt;/P&gt;&lt;P&gt;I’ve confirmed that the DBR version is 9.1. I wanted to discuss some configurations related to RocksDB and performance optimization. Specifically, I am considering adjusting the following settings:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows false&lt;/LI-CODE&gt;&lt;P&gt;This defaults to true, but I am contemplating turning it off for better performance, since tracking the number of rows in the state store adds overhead on write operations. In your experience, does disabling this setting improve performance, especially with large state sizes? I understand that turning it off means numTotalStateRows is reported as 0, but it should help improve throughput.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;spark.databricks.rocksDB.verifyBeforeUpload false&lt;/LI-CODE&gt;&lt;P&gt;I could not find any documentation for this one (Databricks has not documented it publicly), so I am keen to know what it does and what impact it has.&lt;/P&gt;&lt;P&gt;Given that the background compaction cleans up stale data (especially if keys have expired via TTL), I want to make sure I’m tuning the system appropriately for performance as well as solving this issue. I would love to hear your thoughts on the trade-offs involved here.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Jul 2025 08:27:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126771#M47768</guid>
      <dc:creator>susmitsircar</dc:creator>
      <dc:date>2025-07-29T08:27:03Z</dc:date>
    </item>
    <item>
      <title>Re: Spark streaming failing intermittently with IllegalStateException: Found no SST files</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126861#M47786</link>
      <description>&lt;P&gt;&lt;STRONG&gt;More context:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;This error mostly occurs when we restart the job run, and it affects 1 or 2 datasets at random. We restart the job via the Databricks API.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Jul 2025 20:37:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126861#M47786</guid>
      <dc:creator>susmitsircar</dc:creator>
      <dc:date>2025-07-29T20:37:59Z</dc:date>
    </item>
    <item>
      <title>Re: Spark streaming failing intermittently with IllegalStateException: Found no SST files</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126928#M47804</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/89888"&gt;@mani_22&lt;/a&gt;&amp;nbsp;any help with the above question?&lt;BR /&gt;It should be safe, but I want to be very sure.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jul 2025 10:51:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126928#M47804</guid>
      <dc:creator>susmitsircar</dc:creator>
      <dc:date>2025-07-30T10:51:49Z</dc:date>
    </item>
    <item>
      <title>Re: Spark streaming failing intermittently with IllegalStateException: Found no SST files</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126934#M47806</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/176996"&gt;@susmitsircar&lt;/a&gt;,&amp;nbsp;the spark.databricks.rocksDB.verifyBeforeUpload config determines whether a verification check is run before the RocksDB checkpoint files are uploaded. The default value is true. Since the SST files are missing, disabling this config bypasses the check that raises the error.&lt;/P&gt;
&lt;P class="p1"&gt;Regarding the other config, spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows, setting it to false causes Spark to skip&amp;nbsp;tracking the number of rows, which can result in faster write operations and improved performance, especially in high-volume or heavily stateful streaming workloads.&lt;/P&gt;
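&lt;P class="p1"&gt;A minimal sketch of applying both settings from a notebook, assuming they can be set at the session level before the streaming query starts. Whether spark.databricks.rocksDB.verifyBeforeUpload is honored there, or must instead go in the cluster's Spark config, is an assumption to verify on your DBR version:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from pyspark.sql import SparkSession

# In a Databricks notebook `spark` already exists; getOrCreate() returns that session.
spark = SparkSession.builder.getOrCreate()

# Skip row-count tracking in the RocksDB state store
# (numTotalStateRows is then reported as 0).
spark.conf.set("spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows", "false")

# Assumption: this undocumented flag is accepted at session level.
# Set it before the streaming query starts.
spark.conf.set("spark.databricks.rocksDB.verifyBeforeUpload", "false")&lt;/LI-CODE&gt;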
&lt;P class="p1"&gt;Hope this helps!&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jul 2025 11:52:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126934#M47806</guid>
      <dc:creator>mani_22</dc:creator>
      <dc:date>2025-07-30T11:52:56Z</dc:date>
    </item>
    <item>
      <title>Re: Spark streaming failing intermittently with IllegalStateException: Found no SST files</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126938#M47809</link>
      <description>&lt;P&gt;Yes, I was aware of this one:&lt;BR /&gt;&lt;BR /&gt;spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows&lt;BR /&gt;&lt;BR /&gt;&lt;A href="https://spark.apache.org/docs/3.5.0/structured-streaming-programming-guide.html#performance-aspect-considerations" target="_blank" rel="noopener"&gt;https://spark.apache.org/docs/3.5.0/structured-streaming-programming-guide.html#performance-aspect-considerations&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Thanks for the clarification on the following:&lt;/P&gt;&lt;PRE&gt;spark.databricks.rocksDB.verifyBeforeUpload false&lt;/PRE&gt;&lt;P&gt;We will try this out, and I am marking it as the solution for now. Thanks.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jul 2025 12:12:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/126938#M47809</guid>
      <dc:creator>susmitsircar</dc:creator>
      <dc:date>2025-07-30T12:12:48Z</dc:date>
    </item>
    <item>
      <title>Re: Spark streaming failing intermittently with IllegalStateException: Found no SST files</title>
      <link>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/128261#M48191</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/89888"&gt;@mani_22&lt;/a&gt;&amp;nbsp;Do you see any risk in disabling this flag in our pipeline? As far as I understand, we will be bypassing some heuristic checks while uploading the state files.&lt;/P&gt;&lt;PRE&gt;spark.databricks.rocksDB.verifyBeforeUpload false&lt;/PRE&gt;</description>
      <pubDate>Tue, 12 Aug 2025 16:47:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/spark-streaming-failing-intermittently-with-llegalstateexception/m-p/128261#M48191</guid>
      <dc:creator>susmitsircar</dc:creator>
      <dc:date>2025-08-12T16:47:16Z</dc:date>
    </item>
  </channel>
</rss>

