Databricks Community

mriccardi · ‎06-02-2023

Hi Everyone!

Today 4 streaming jobs started to fail out of nowhere due to: StreamingQueryException: [STREAM_FAILED] Query [id = ####, runId = ####] terminated with exception: dbfs:/mnt/path/my_table/sources/0/0 doesn't exist (latestId: 8, compactInterval: 10).

These streams have been running for over a year.
The only change we made was in March, when we added one more column to the schema.
The streams read Parquet data from S3 and run once daily.
To keep track of the files already loaded, we have a checkpoint path defined for each table.

What we found:

When I go to the path sources/0, the file 0 does not exist.
We do find the file 711, which was created on May 23.
For some reason, on May 24 the stream failed to read the latest batchId state and reset the batchId to 0. It also stopped writing files to the sources, offsets, and commits folders of the checkpoint location.
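
To see where each of these logs stops, the checkpoint folders can be listed and summarised. Below is a minimal Python sketch, assuming the checkpoint is reachable as a local path (on Databricks this is typically via the /dbfs FUSE mount). The function name `summarize_checkpoint` and the example path are hypothetical and only for illustration.

```python
import os
import re


def summarize_checkpoint(checkpoint_dir):
    """Summarise batch files found in the main checkpoint log folders.

    Only names that look like batch files are counted: plain integers
    ("711") or compacted batches ("709.compact"). Anything else, such as
    checksum or metadata files, is ignored.
    """
    summary = {}
    for sub in ("sources/0", "offsets", "commits"):
        path = os.path.join(checkpoint_dir, *sub.split("/"))
        batch_ids = []
        if os.path.isdir(path):
            for name in os.listdir(path):
                match = re.fullmatch(r"(\d+)(\.compact)?", name)
                if match:
                    batch_ids.append(int(match.group(1)))
        batch_ids.sort()
        summary[sub] = {
            "count": len(batch_ids),
            "min": batch_ids[0] if batch_ids else None,
            "max": batch_ids[-1] if batch_ids else None,
        }
    return summary


if __name__ == "__main__":
    # Hypothetical path: replace with the real checkpoint location.
    for folder, info in summarize_checkpoint("/dbfs/mnt/path/my_table").items():
        print(folder, info)
```

Comparing the `max` value across the three folders shows which log stopped advancing first, and a missing or unexpectedly low `min` in sources/0 points at files that were removed or never written.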

Root cause:

My understanding is that, for some reason, Spark Structured Streaming lost the last state of the checkpoint and also stopped writing to the checkpoint.
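
The numbers in the error message (latestId: 8, compactInterval: 10) line up with how, as far as I understand it, the file source log decides which batch files it must read. Every `compactInterval`-th batch is written as an `N.compact` file, and a restart needs the most recent compact file plus every batch after it. Here is a small Python sketch of that arithmetic. The function names are mine, and this reflects my reading of the logic rather than Spark's actual code.

```python
def is_compaction_batch(batch_id, compact_interval):
    """True if this batch would be written as '<batch_id>.compact'."""
    return (batch_id + 1) % compact_interval == 0


def batches_needed(latest_id, compact_interval):
    """Batch ids that must exist to rebuild the log up to latest_id.

    Starts at the most recent compaction batch at or before latest_id,
    or at 0 if no compaction has happened yet.
    """
    start = max(0, (latest_id + 1) // compact_interval * compact_interval - 1)
    return list(range(start, latest_id + 1))


if __name__ == "__main__":
    print(batches_needed(8, 10))    # [0, 1, 2, 3, 4, 5, 6, 7, 8]
    print(batches_needed(711, 10))  # [709, 710, 711]
    print(is_compaction_batch(709, 10))  # True
```

With latestId 8 and compactInterval 10 there has been no compaction yet, so batch 0 is required, which matches the "sources/0/0 doesn't exist" message. With a latestId of 711, only 709.compact, 710, and 711 would be needed, so the failure is consistent with the batchId having been reset rather than with file 0 simply being cleaned up.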

Has anyone experienced something like this? How did you manage to recover without processing all the files again?

Thanks in advance!

Vartika · ‎06-09-2023

Hi @Martin Riccardi,

We haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you.

Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.

Thanks!


Spark Streaming: Checkpoint corrupted
