Data Engineering

Spark streaming failing intermittently with IllegalStateException: Found no SST files

susmitsircar
New Contributor III

I'm encountering the following error while trying to upload a RocksDB checkpoint in Databricks:

java.lang.IllegalStateException: Found no SST files during uploading RocksDB checkpoint version 498 with 2332 key(s).
    at com.databricks.sql.streaming.state.RocksDBFileManager.verifyImmutableFiles(RocksDBFileManager.scala:620)
    at com.databricks.sql.streaming.state.RocksDBFileManager.saveCheckpointToDbfs(RocksDBFileManager.scala:173)
    at com.databricks.sql.rocksdb.CloudRocksDB.$anonfun$sync$7(CloudRocksDB.scala:235)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:668)
    at com.databricks.sql.rocksdb.CloudRocksDB.timeTakenMs(CloudRocksDB.scala:634)
    at com.databricks.sql.rocksdb.CloudRocksDB.$anonfun$sync$1(CloudRocksDB.scala:234)
    at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
    at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:395)

Background:

I am using Databricks with Spark and RocksDB for stateful streaming. This error occurs when Spark attempts to upload a RocksDB checkpoint, and the system reports that no SST files were found.
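
For reference, here is a minimal sketch of how a stateful query with the RocksDB state store is typically wired up on Databricks. This is not our exact pipeline: the rate source, grouping key, and checkpoint path are placeholder assumptions; the provider class matches the com.databricks.sql.streaming.state package in the stack trace above.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().getOrCreate()

// Enable the RocksDB-backed state store (Databricks-native provider,
// consistent with the stack trace; on open-source Spark 3.2+ the class is
// org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider).
spark.conf.set(
  "spark.sql.streaming.stateStore.providerClass",
  "com.databricks.sql.streaming.state.RocksDBStateStoreProvider")

// Toy source standing in for the real input stream.
val events = spark.readStream
  .format("rate")
  .option("rowsPerSecond", "100")
  .load() // columns: timestamp, value

// Stateful aggregation; the watermark lets old state be cleaned up.
val counts = events
  .withWatermark("timestamp", "10 minutes")
  .groupBy(window(col("timestamp"), "10 minutes"), col("value") % 10)
  .count()

counts.writeStream
  .outputMode("update")
  .option("checkpointLocation", "s3://my-bucket/checkpoints/my-stream") // hypothetical path
  .format("console")
  .start()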

  • What could be causing this error and why are no SST files being found during the upload process?

  • Are there any specific configurations or setups I might be missing for properly handling RocksDB checkpoints in Databricks?

  • What potential solutions or workarounds exist for this issue?

We are using all default Spark runtime configurations.

Spark version: 2.x

Current workaround:

Deleting the checkpoint from S3 and retriggering the streaming pipeline fixes the issue.
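
Roughly, the workaround looks like this in a notebook (the checkpoint path is a hypothetical placeholder; dbutils is available in Databricks notebooks):

// Recursively delete the streaming checkpoint (state, offsets, commits).
val checkpointPath = "s3://my-bucket/checkpoints/my-stream" // hypothetical
dbutils.fs.rm(checkpointPath, true)

// The job is then retriggered via the Databricks Jobs API, and the query
// rebuilds its state from scratch.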

1 ACCEPTED SOLUTION


mani_22
Databricks Employee

@susmitsircar It’s possible the keys in the partition are too old (more than 7 days) and have been TTLed, which results in no SST files. RocksDB runs a background compaction to clean up stale data. What DBR version are you using?

There are some heuristic checks performed on RocksDB files before uploading. You can set the configs below to turn off the checks:

spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows false
spark.databricks.rocksDB.verifyBeforeUpload false
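
For example, in a notebook before the stream starts (a sketch; both settings can equally be placed in the cluster's Spark config):

// Skip per-batch total row counting in the RocksDB state store
// (faster writes; numTotalStateRows will be reported as 0).
spark.conf.set("spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows", "false")

// Skip the heuristic verification of RocksDB files before the checkpoint
// upload, which bypasses the "Found no SST files" check.
spark.conf.set("spark.databricks.rocksDB.verifyBeforeUpload", "false")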


7 REPLIES


susmitsircar
New Contributor III

More context:

This error happens mostly when we restart the job run, and it occurs randomly for one or two datasets. Restarts are done via the Databricks API.

susmitsircar
New Contributor III

Thanks for your reply, that's a super catch; now I'm able to connect the dots.

I’ve confirmed that the DBR version is 9.1. I wanted to discuss some configurations related to RocksDB and performance optimizations. Specifically, I am considering adjusting the following settings:

spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows false

This is currently set to true by default, but I am considering turning it off for better performance, since tracking the number of rows in the state store adds overhead to write operations. In your experience, is disabling this setting beneficial for performance, especially when dealing with large state sizes? I understand that turning it off will report numTotalStateRows as 0, but it should help improve throughput.
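
For reference, this is how I'd watch the effect on the reported metric, assuming a handle to the running query (query is a placeholder for our org.apache.spark.sql.streaming.StreamingQuery):

// Inspect state-store metrics from the most recent progress report.
val progress = query.lastProgress
if (progress != null) {
  progress.stateOperators.foreach { op =>
    // With trackTotalNumberOfRows=false, numRowsTotal is reported as 0,
    // while numRowsUpdated still reflects per-batch writes.
    println(s"numRowsTotal=${op.numRowsTotal} numRowsUpdated=${op.numRowsUpdated}")
  }
}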

spark.databricks.rocksDB.verifyBeforeUpload false

I couldn't find any documentation for this one (nothing public from Databricks), so I'm keen to know what it does and how it will impact the pipeline.

 

Given that the background compaction cleans up stale data (especially if keys are TTL-ed), I want to make sure I’m optimizing the system appropriately for performance, as well as solving this issue. Would love to hear your thoughts on the trade-offs involved here.

@mani_22 any help on the above questions?
It should be safe, but I want to be very sure.

mani_22
Databricks Employee

Hi @susmitsircar, the spark.databricks.rocksDB.verifyBeforeUpload config determines whether a verification check is conducted on RocksDB checkpoint files before they are uploaded. The default value is true. Since the SST files are lost, disabling this config will bypass the error.

Regarding the other config, spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows, setting it to false causes Spark to skip tracking the number of rows, which can result in faster write operations and improved performance, especially in high-volume or heavily stateful streaming workloads.

Hope this helps!

susmitsircar
New Contributor III

Yeah, I was aware of this one:

spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows

https://spark.apache.org/docs/3.5.0/structured-streaming-programming-guide.html#performance-aspect-c...

Thanks for the clarification on

spark.databricks.rocksDB.verifyBeforeUpload

We will try this out; marking it as the solution for now. Thanks.

susmitsircar
New Contributor III

@mani_22 Do you see any risk in disabling this flag in our pipeline? As far as I understand, we will be bypassing some heuristic checks while uploading the state files.

spark.databricks.rocksDB.verifyBeforeUpload false

 
