Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Error writing parquet files

JEAG
New Contributor III

Hi, we are seeing this chain of errors every day, in different files and processes:

An error occurred while calling o11255.parquet.

: org.apache.spark.SparkException: Job aborted.

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 982.0 failed 4 times, most recent failure: Lost task 0.3 in stage 982.0 (TID 85705, 172.20.45.5, executor 31): org.apache.spark.SparkException: Task failed while writing rows.

Caused by: com.databricks.sql.io.FileReadException: Error while reading file dbfs: ... It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.

Caused by: shaded.parquet.org.apache.thrift.transport.TTransportException: java.io.IOException: Stream is closed!

Caused by: java.io.IOException: Stream is closed!

Caused by: java.io.FileNotFoundException: dbfs:/...

For now, we fix it by deleting the file and re-running the job, but we don't know how to prevent the error.
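The FileReadException above suggests a workaround in its own message: invalidate Spark's cached file listing with REFRESH TABLE, or recreate the DataFrame instead of reusing one whose underlying files may have changed. A minimal sketch of that pattern is below; the table and path names are placeholders, not values from the original post.

```python
# Sketch of the invalidation step hinted at by the FileReadException.
# All table/path names here are hypothetical examples.

def invalidation_statements(table_name: str) -> list:
    """SQL to run before re-reading a table whose files may have changed,
    per the hint in Spark's error message ('REFRESH TABLE tableName')."""
    return ["REFRESH TABLE {}".format(table_name)]

# Usage in a Databricks notebook (the `spark` session is provided by the runtime):
# for stmt in invalidation_statements("my_schema.my_table"):
#     spark.sql(stmt)
#
# Then re-create the DataFrame from source rather than reusing a cached one:
# df = spark.read.parquet("dbfs:/path/to/source")
# df.write.mode("overwrite").parquet("dbfs:/path/to/dest")
```

Re-creating the DataFrame matters because a previously built DataFrame can hold a stale listing of the underlying Parquet files; if another job rewrites or deletes those files mid-run, the write stage fails with exactly this FileNotFoundException chain.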

Any ideas?

Thanks

15 REPLIES

Kolana
New Contributor II

Hi,

I am facing this issue now as well. Did you identify a fix?