
DLT Fails with Exception: CANNOT_READ_STREAMING_STATE_FILE

DaPo
New Contributor III

I have several DLT pipelines writing to a schema in Unity Catalog. The storage location of the catalog is managed by the Databricks deployment (on AWS).

The schema and the DLT pipelines are managed via Databricks Asset Bundles. I did not change any storage location configuration, and I use the default metastore.

For one of my DLT tables, I get an error message that it cannot read the streaming state file (full message below). Here are the things I have tried, without success:

  • run `databricks bundle destroy` and then `databricks bundle deploy` again
  • go to the AWS console and delete the checkpoint files manually
  • go to the AWS console and delete everything inside the S3 object for the relevant schema
  • double- and triple-check that there is no naming conflict for the table (there is none)

Does anyone have suggestions on how to fix this?

Greetings, Daniel

If it helps: I run with DLT runtime version 16.1.1. Here is the full error message:

org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 8e614f5a-cdb7-4942-962d-6cdcee920df7, runId = 8a2f8254-82ab-409d-82a1-2e745cfcbace] terminated with exception: org.apache.spark.SparkException: [CANNOT_LOAD_STATE_STORE.CANNOT_READ_STREAMING_STATE_FILE] An error occurred during loading state. Error reading streaming state file of HDFSStateStoreProvider[id = (op=4,part=0),dir = s3://databricks-workspace-stack-876d9-bucket/unity-catalog/520995832158046/dev/__unitystorage/schemas/07975d9e-97e1-42c8-96a5-a90498e75223/tables/f6fc5371-9617-4cb2-a48b-2f3aee236c1e/_dlt_metadata/checkpoints/***/0/state/4/0]: s3://databricks-workspace-stack-876d9-bucket/unity-catalog/520995832158046/dev/__unitystorage/schemas/07975d9e-97e1-42c8-96a5-a90498e75223/tables/f6fc5371-9617-4cb2-a48b-2f3aee236c1e/_dlt_metadata/checkpoints/***/0/state/4/0/1.delta does not exist. If the stream job is restarted with a new or updated state operation, please create a new checkpoint location or clear the existing checkpoint location. SQLSTATE: 58030 SQLSTATE: XXKST 

As a final remark: I checked, and the state file s3://<...>/checkpoints/***/0/state/4/0/1.delta indeed does not exist. But the following file is there: s3://<...>/checkpoints/***/0/state/4/1.delta

2 REPLIES

mani_22
Databricks Employee

Hi @DaPo, have you made any code changes to your streaming query? There are limitations on which changes in a streaming query are allowed between restarts from the same checkpoint location. Refer to this documentation.
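For context, here is a minimal sketch (with hypothetical table and column names) of the kind of change the documentation warns about: adding a stateful operator such as deduplication or a windowed aggregation to a query that is then restarted from the old checkpoint changes the state layout and can surface errors like the one above.

```python
# Hypothetical illustration only; the table and column names are made up.
from pyspark.sql import functions as F

events = spark.readStream.table("dev.bronze.events")

# Version 1 of the query: a stateless projection, written with some checkpoint location.
v1 = events.select("id", "ts", "value")

# Version 2, deployed later: dropDuplicates and the windowed aggregation are both
# stateful operators, so the number and layout of state stores changes. Restarting
# this version against the checkpoint written by v1 can fail with errors such as
# CANNOT_READ_STREAMING_STATE_FILE; the remedy is a new checkpoint or a full refresh.
v2 = (
    events
    .withWatermark("ts", "10 minutes")
    .dropDuplicates(["id"])
    .groupBy(F.window("ts", "10 minutes"))
    .agg(F.sum("value").alias("total"))
)
```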

The checkpoint location appears to be corrupted, as some files are missing. You can try performing a FULL REFRESH on the pipeline.
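Besides the "Full refresh" option in the pipeline UI, you can also trigger a full-refresh update programmatically. A rough sketch using the Databricks SDK for Python (the pipeline ID is a placeholder; check the SDK reference for the exact signature in your SDK version):

```python
# Sketch: start a pipeline update with full refresh via the Databricks SDK.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # picks up credentials from the environment / CLI profile

w.pipelines.start_update(
    pipeline_id="<your-pipeline-id>",  # placeholder
    full_refresh=True,                 # recompute all tables, discarding existing state
)
```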

DaPo
New Contributor III

Hi @mani_22, the issue was hidden somewhere in my code. (If I remember correctly, I was using an internal library that created a Spark DataFrame "on the fly" using spark.createDataFrame([some, data]). That DataFrame was not backed by a table in Unity Catalog. The logic worked fine in batch workflows, but not in a streaming DLT pipeline.) My solution was to save that DataFrame as a table and load that table instead.
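To make the workaround concrete, here is a rough sketch with hypothetical catalog, table, and column names (the actual library and tables were internal): persist the in-memory DataFrame as a Unity Catalog table once, then have the DLT pipeline read that table instead of calling createDataFrame inside the streaming flow.

```python
import dlt
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# One-off (or batch-job) step, run outside the DLT pipeline:
# materialize the on-the-fly data as a real Unity Catalog table.
lookup_df = spark.createDataFrame(
    [("a", 1), ("b", 2)],
    schema="key string, value int",
)
lookup_df.write.mode("overwrite").saveAsTable("dev.my_schema.lookup")

# Inside the DLT pipeline: read the persisted table instead of
# building the DataFrame on the fly.
@dlt.table
def enriched():
    lookup = spark.read.table("dev.my_schema.lookup")          # static side
    source = spark.readStream.table("dev.my_schema.source_events")  # streaming side
    return source.join(lookup, on="key", how="left")
```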