pyspark streaming failed for no reason

patojo94
New Contributor II

Hi everyone, I have a PySpark streaming job reading from an AWS Kinesis stream that suddenly failed for no reason (I mean, we have not made any changes recently).

It is giving the following error:

ERROR MicroBatchExecution: Query kinesis_events_prod_bronze [id = 06233cfc-e27d-410d-858b-7c2546c5004f, runId = ace41ec4-c18b-421f-9e5b-bf5f75c96b12] terminated with error
java.lang.IllegalStateException: The transaction log has failed integrity checks. We recommend you contact Databricks support for assistance. To disable this check, set spark.databricks.delta.state.corruptionIsFatal to false

Do you have any idea of what could have happened or how to fix it?

Thank you!
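For context, here is a minimal sketch of the kind of Kinesis-to-Delta bronze streaming job that can hit this error. The stream name, region, checkpoint path, and credentials setup are illustrative assumptions, not the poster's actual job; only the table name is taken from the error message above.

# Minimal sketch: Databricks Kinesis source -> Delta bronze table (illustrative values only).
# Assumes the cluster's instance profile grants read access to the Kinesis stream.
raw = (
    spark.readStream
    .format("kinesis")                     # Databricks Kinesis connector
    .option("streamName", "events-prod")   # hypothetical stream name
    .option("region", "us-east-1")         # hypothetical region
    .option("initialPosition", "latest")
    .load()
)

query = (
    raw.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://my-bucket/checkpoints/kinesis_events_prod_bronze")  # hypothetical path
    .outputMode("append")
    .toTable("kinesis_events_prod_bronze")  # table name from the error message
)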

5 REPLIES

Hubert-Dudek
Esteemed Contributor III (accepted solution)

@patricio tojo, it seems that some record coming from AWS Kinesis is corrupted. I think you can debug it on the Kinesis side.

Kaniz
Community Manager

Hi @patricio tojo, just a friendly follow-up. Do you still need help, or did @Hubert Dudek's response help you find the solution? Please let us know.

jose_gonzalez
Moderator

Hi @patricio tojo​,

Did you increase or reduce your Kinesis shards? Or did you remove your checkpoint?
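Along the same diagnostic lines, one possible way to check whether anything unusual happened to the target table's transaction log is to look at its recent Delta history. This is a sketch, not a fix; the table name is taken from the error message.

from delta.tables import DeltaTable

# Look at recent commits on the bronze table: missing versions, unexpected
# RESTORE/REPLACE operations, or manual changes around the failure time are clues.
dt = DeltaTable.forName(spark, "kinesis_events_prod_bronze")
dt.history(20).show(truncate=False)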

Kaniz
Community Manager

Hi @patricio tojo, we haven't heard from you since the last response from @Jose Gonzalez, and I was checking back to see whether you have a resolution yet. If you have a solution, please share it with the community, as it can be helpful to others. Otherwise, we will respond with more details and try to help.

jcasanella
New Contributor III

@patricio tojo I have the same problem, although in my case it appeared after migrating to Unity Catalog. I need to investigate a little more, but after adding this to my Spark job, it works:

spark.conf.set("spark.databricks.delta.state.corruptionIsFatal", False)
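One caveat on this workaround: the setting is the one suggested by the error message itself, but it only makes the integrity check non-fatal rather than repairing the transaction log, so it is worth confirming the table's history looks sane before relying on it. If it helps, the same setting can also be applied at the cluster level through the cluster's Spark config (an equivalent sketch, assuming you want it for every job on that cluster):

# Cluster Spark config entry (key and value separated by a space)
spark.databricks.delta.state.corruptionIsFatal false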
