Databricks streaming job issue with Autoloader for new checkpoint.

Himanshi
New Contributor III

Hi Team,

I am trying to run a streaming job in Databricks that uses Auto Loader to read Parquet files from Azure Data Lake Storage Gen2. I created a new checkpoint. The first offset gets created, but the job then fails with this error: "py4j.Py4JException: An exception was raised by the Python Proxy. Return Message: Traceback (most recent call last):"

When I expanded that error, I found the following:

"py4j.protocol.Py4JJavaError: An error occurred while calling o2990.save. : org.apache.spark.SparkException: Job aborted." , "Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 11.0 failed 4 times, most recent failure: Lost task 5.3 in stage 11.0 (TID 115) (172.20.58.133 executor 1): com.databricks.sql.io.FileReadException: Error while reading file /mnt/adl2/kind=data/evolution=2/file_format=parquet/ingestion_date=2022/08/03/13/-13abc.parquet."

"Caused by: java.lang.AssertionError: assertion failed"

What could be the reason? Please suggest a solution.
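
One thing that might help narrow this down: a `FileReadException` that names a single file can sometimes mean that file is truncated, empty, or not really Parquet. That is only an assumption here, not a confirmed cause of the `AssertionError`. A minimal sketch of a framing check is below. It relies only on the Parquet file layout (a leading and trailing `PAR1` marker, with a 4-byte little-endian footer length before the trailing marker). The function name `check_parquet_framing` is made up for illustration, and running it on Databricks assumes the mount is readable as a local path under `/dbfs`.

```python
import os
import struct

PARQUET_MAGIC = b"PAR1"  # marker at the start and end of an unencrypted Parquet file


def check_parquet_framing(path):
    """Return (ok, reason) after checking only the outer framing of a Parquet file.

    This does not parse the footer metadata or any row group; it only catches
    empty, truncated, or non-Parquet files.
    """
    size = os.path.getsize(path)
    # Smallest possible layout: 4 (magic) + 4 (footer length) + 4 (magic) bytes.
    if size < 12:
        return False, f"file is only {size} bytes"

    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)
        footer_len_bytes = f.read(4)
        tail = f.read(4)

    if head != PARQUET_MAGIC:
        return False, "missing leading PAR1 marker"
    if tail != PARQUET_MAGIC:
        return False, "missing trailing PAR1 marker"

    (footer_len,) = struct.unpack("<I", footer_len_bytes)
    # The footer must fit between the leading marker and the trailing 8 bytes.
    if footer_len == 0 or footer_len > size - 12:
        return False, f"implausible footer length {footer_len}"

    return True, "framing looks valid"
```

It could be called on the file named in the stack trace, for example `check_parquet_framing("/dbfs" + failing_path)`, where `failing_path` is a hypothetical variable holding the path from the error message. A passing result does not prove the file is readable by Spark; a failing one suggests the file itself should be inspected or re-ingested.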
