Data Engineering

Manual overwrite in the S3 console of a collection of Parquet files, and now we can't read them.

fff_ds
New Contributor
org.apache.spark.SparkException: Job aborted due to stage failure: Task 19 in stage 26.0 failed 4 times, most recent failure: Lost task 19.3 in stage 26.0 (TID 4205, 10.66.225.154, executor 0): com.databricks.sql.io.FileReadException: Error while reading file s3://s3-datascience-prod/redshift/daily/raw/ds/product_information/date=2022-02-05/part-00002-d81f7a47-0421-42a7-9187-f421b0c734b9.c000.snappy.parquet. A file referenced in the transaction log cannot be found. This occurs when data has been manually deleted from the file system rather than using the table `DELETE` statement. For more information, see https://docs.databricks.com/delta/delta-intro.html#frequently-asked-questions

We are trying to copy a known-good file (Delta format) over several days of known-bad files. We initially did this via the S3 CLI and hit the error above. We then tried copying via dbutils in a Databricks notebook over the same file paths and got the same error. How can we reset this to a state where we no longer hit the error and our known-good Delta files are in place for every date in the range?
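For reference, this is roughly the repair path we are considering (a sketch, not verified against this table: the good/bad dates below are placeholders, and the delta.`<path>` form of FSCK assumes a path-based table, so substitute the catalog table name if it is registered):

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Table path taken from the error message above.
TABLE_PATH = "s3://s3-datascience-prod/redshift/daily/raw/ds/product_information"
GOOD_DATE = "2022-02-01"                  # placeholder: the day whose data is known good
BAD_DATES = ["2022-02-05", "2022-02-06"]  # placeholder: the days that were manually overwritten

# Step 1: drop transaction-log entries that point at files the manual S3 copy
# clobbered, so reads stop failing with FileReadException.
spark.sql(f"FSCK REPAIR TABLE delta.`{TABLE_PATH}`")

# Step 2: read the known-good partition back through Delta (not raw Parquet).
good_df = spark.read.format("delta").load(TABLE_PATH).where(F.col("date") == GOOD_DATE)

# Step 3: rewrite each bad partition through the Delta writer so the transaction
# log records the new files; replaceWhere limits each overwrite to that one date.
for bad_date in BAD_DATES:
    (good_df
        .withColumn("date", F.lit(bad_date))  # cast to match the table's date column type if needed
        .write
        .format("delta")
        .mode("overwrite")
        .option("replaceWhere", f"date = '{bad_date}'")
        .save(TABLE_PATH))

Note that FSCK REPAIR TABLE only removes log entries for files that can no longer be found; it does not restore any data, which is why the rewrite step goes through the Delta writer instead of another raw S3 copy.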

1 REPLY

Anonymous
Not applicable

Hello, @Lili Ehrlich. Welcome! My name is Piper, and I'm a moderator for Databricks. Thank you for bringing your question to us. Let's give it a while for the community to respond first.

Thanks in advance for your patience. 🙂
