org.apache.spark.SparkException: Job aborted due to stage failure: Task 19 in stage 26.0 failed 4 times, most recent failure: Lost task 19.3 in stage 26.0 (TID 4205, 10.66.225.154, executor 0): com.databricks.sql.io.FileReadException: Error while reading file s3://s3-datascience-prod/redshift/daily/raw/ds/product_information/date=2022-02-05/part-00002-d81f7a47-0421-42a7-9187-f421b0c734b9.c000.snappy.parquet. A file referenced in the transaction log cannot be found. This occurs when data has been manually deleted from the file system rather than using the table `DELETE` statement. For more information, see https://docs.databricks.com/delta/delta-intro.html#frequently-asked-questions
We are trying to copy a Delta-format file with known-good properties over several days of files with known-bad properties. We initially did this via the S3 CLI but ran into the error above. We then tried copying over the same file paths via dbutils in a Databricks notebook and hit the same error again. How can we reset this to a state where we no longer encounter the error above and our known-good Delta files are copied for every date in the date range?
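For context, this is roughly the dbutils copy we ran from the notebook. The known-good source path and the date list below are illustrative placeholders, not our actual values:

# Rough sketch of the dbutils-based copy we attempted.
# The known-good source path and the dates here are placeholders.
good_file = (
    "s3://s3-datascience-prod/redshift/daily/raw/ds/product_information/"
    "known_good/part-00000.snappy.parquet"
)

for d in ["2022-02-05", "2022-02-06", "2022-02-07"]:
    target = (
        "s3://s3-datascience-prod/redshift/daily/raw/ds/product_information/"
        f"date={d}/part-00000.snappy.parquet"
    )
    # Copy the known-good Parquet file into each bad date partition.
    dbutils.fs.cp(good_file, target)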