07-28-2025 06:45 AM
I'm encountering the following error while trying to upload a RocksDB checkpoint in Databricks:
java.lang.IllegalStateException: Found no SST files during uploading RocksDB checkpoint version 498 with 2332 key(s).
at com.databricks.sql.streaming.state.RocksDBFileManager.verifyImmutableFiles(RocksDBFileManager.scala:620)
at com.databricks.sql.streaming.state.RocksDBFileManager.saveCheckpointToDbfs(RocksDBFileManager.scala:173)
at com.databricks.sql.rocksdb.CloudRocksDB.$anonfun$sync$7(CloudRocksDB.scala:235)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:668)
at com.databricks.sql.rocksdb.CloudRocksDB.timeTakenMs(CloudRocksDB.scala:634)
at com.databricks.sql.rocksdb.CloudRocksDB.$anonfun$sync$1(CloudRocksDB.scala:234)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:395)
Background:
I am using Databricks with Spark and RocksDB for stateful streaming. This error occurs when Spark attempts to upload a RocksDB checkpoint, and the system reports that no SST files were found.
What could be causing this error and why are no SST files being found during the upload process?
Are there any specific configurations or setups I might be missing for properly handling RocksDB checkpoints in Databricks?
What potential solutions or workarounds exist for this issue?
We are using all default Spark runtime configurations.
Spark version: 2.x
Current workaround:
Deleting the checkpoint from S3 and retriggering the streaming pipeline fixes the issue.
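For reference, a minimal sketch of that workaround, assuming a Databricks notebook where dbutils is available; the S3 path is hypothetical and should be replaced with the affected query's actual checkpointLocation. Note that deleting the checkpoint discards the accumulated state, so the stream rebuilds it from scratch:

// Hypothetical checkpoint path; substitute the checkpointLocation of the affected query
val checkpointPath = "s3://my-bucket/checkpoints/my-stream"

// Stop the streaming job first, then delete the corrupted checkpoint recursively
dbutils.fs.rm(checkpointPath, true)

// Retriggering the pipeline afterwards rebuilds the state from the source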
07-28-2025 01:23 PM
@susmitsircar It's possible the keys in the partition are too old (more than 7 days) and have been TTLed, which results in no SST files; RocksDB runs a background compaction to clean up stale data. What is the DBR version you are using?
There are some heuristic checks performed on RocksDB files before uploading. You can set the configs below to turn off the checks:
spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows false
spark.databricks.rocksDB.verifyBeforeUpload false
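As a hedged sketch, in a Scala notebook these could be set on the session before the streaming query is started (they can equally go into the cluster's Spark config):

// Set before starting the streaming query; values as suggested above
spark.conf.set("spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows", "false")
spark.conf.set("spark.databricks.rocksDB.verifyBeforeUpload", "false")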
07-29-2025 01:37 PM
More context:
This error happens mostly when we restart the job run, and it occurs randomly for one or two datasets. The restart is done via the Databricks API.
07-29-2025 01:27 AM
Thanks for your reply. It's a super catch; now I am able to connect the dots.
I’ve confirmed that the DBR version is 9.1. I wanted to discuss some configurations related to RocksDB and performance optimizations. Specifically, I am considering adjusting the following settings:
spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows false
This is currently set to true by default, but I am contemplating turning it off for better performance, since tracking the number of rows in the state store adds overhead on write operations. From your experience, is disabling this setting beneficial, particularly when dealing with large state sizes? I understand that turning it off will report numTotalStateRows as 0 (see the sketch below), but it should help improve throughput.
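For illustration, a minimal sketch of where this metric surfaces, assuming a running StreamingQuery handle named query (not from the thread). The count the docs refer to as numTotalStateRows appears as the numRowsTotal field of the state operator progress:

// Inspect state-store metrics from the most recent micro-batch
val progress = query.lastProgress
progress.stateOperators.foreach { op =>
  // With trackTotalNumberOfRows=false, numRowsTotal reports 0 instead of the real count
  println(s"numRowsTotal=${op.numRowsTotal}, numRowsUpdated=${op.numRowsUpdated}")
}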
spark.databricks.rocksDB.verifyBeforeUpload false
I couldn't find any documentation for this one (it doesn't appear to be publicly documented by Databricks), so I'm keen to know what it does and how disabling it will impact us.
07-30-2025 03:51 AM
@mani_22 any help on the above question?
It should be safe, but I want to be very sure.
07-30-2025 04:52 AM
Hi @susmitsircar, the spark.databricks.rocksDB.verifyBeforeUpload config determines whether a verification check is performed on the RocksDB checkpoint files before they are uploaded to cloud storage. The default value is true. Since the SST files are missing, disabling this config will bypass the error.
Regarding the other config, spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows, setting it to false causes Spark to skip tracking the number of rows, which can result in faster write operations and improved performance, especially in high-volume or heavily stateful streaming workloads.
Hope this helps!
07-30-2025 05:12 AM
Yeah, I was aware of this one:
spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows
https://spark.apache.org/docs/3.5.0/structured-streaming-programming-guide.html#performance-aspect-c...
Thanks for the clarification on
spark.databricks.rocksDB.verifyBeforeUpload
We will try this out; marking it as the Solution for now. Thanks.
3 weeks ago
@mani_22 Do you see any risk of disabling this flag in our pipeline? As far as I understand, we will be bypassing some heuristic checks while uploading the state files:
spark.databricks.rocksDB.verifyBeforeUpload false