cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

DLT Fails with Exception: CANNOT_READ_STREAMING_STATE_FILE

DaPo
New Contributor II

I have several DLT Pipeline, writing to some schema in a unity catalog. The storage location of the unity-catalog is managed by the databricks deployment (on AWS).

The schema and the dlt-pipeline are managed via databricks asset bundles. I did not change any storage location configuration, and used the default metastore.

 For one of the my dlt-tables, I get the following an error message, that it can not read the streaming state file (full message below).  Here are things, I have tried, without success:

  • run `databricks bundle destroy` and then `databricks bundle deploy` again.
  • go to the AWS-console, and delete the checkpoint files manually
  • go to the AWS-console, and delete everything inside the s3-object for the relevant schema
  • double and tripple-checked, that there is no naming conflict for the table. There is not

Has anyone suggestions how to fix this?

Greetings, Daniel

If it helps. I run with the dlt-runtime vs 16.1.1. Here is the full error message:

org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 8e614f5a-cdb7-4942-962d-6cdcee920df7, runId = 8a2f8254-82ab-409d-82a1-2e745cfcbace] terminated with exception: org.apache.spark.SparkException: [CANNOT_LOAD_STATE_STORE.CANNOT_READ_STREAMING_STATE_FILE] An error occurred during loading state. Error reading streaming state file of HDFSStateStoreProvider[id = (op=4,part=0),dir = s3://databricks-workspace-stack-876d9-bucket/unity-catalog/520995832158046/dev/__unitystorage/schemas/07975d9e-97e1-42c8-96a5-a90498e75223/tables/f6fc5371-9617-4cb2-a48b-2f3aee236c1e/_dlt_metadata/checkpoints/***/0/state/4/0]: s3://databricks-workspace-stack-876d9-bucket/unity-catalog/520995832158046/dev/__unitystorage/schemas/07975d9e-97e1-42c8-96a5-a90498e75223/tables/f6fc5371-9617-4cb2-a48b-2f3aee236c1e/_dlt_metadata/checkpoints/***/0/state/4/0/1.delta does not exist. If the stream job is restarted with a new or updated state operation, please create a new checkpoint location or clear the existing checkpoint location. SQLSTATE: 58030 SQLSTATE: XXKST 

As a final remark: I checked. The file As a remark: The state file s3://<...>/checkpoints/***/0/state/4/0/1.delta indeed does not exist. But the following file is there                     s3://<...>/checkpoints/***/0/state/4/1.delta

0 REPLIES 0

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group