
Unable to read file from S3

LearningDatabri
Contributor II

I tried to read a file from S3 but got the error below:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 53.0 failed 4 times, most recent failure: Lost task 0.3 in stage 53.0 (TID 82, xx.xx.xx.xx, executor 0): com.databricks.sql.io.FileReadException: Error while reading file s3://<mybucket>/<path>/file.csv.

I used:

spark.read.options(delimiter = '|').option("header", False).csv('s3://<mybucket>/<path>/file.csv')

Accepted solution

Prabakar
Esteemed Contributor III

I remember seeing this issue before. Please check the S3 lifecycle management for the bucket. If the object has been transitioned to another storage class (most likely an archive class), this error can occur.


7 replies

Prabakar
Esteemed Contributor III

Have you verified that the file exists?

Is this happening with all files or only specific files?

The file exists. Some files fail and others work.

Prabakar
Esteemed Contributor III

I remember seeing this issue before. Please check the S3 lifecycle management for the bucket. If the object has been transitioned to another storage class (most likely an archive class), this error can occur.

Thanks @Prabakar Ammeappin for this information. We do have lifecycle management set up. The files that hit this error had not been used for some time and were archived. I wonder why the files were moved to Glacier after 60 days. I'll have to revisit the lifecycle rules and change them.

Now it makes sense why I got the error for some files and not for the others.
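For anyone hitting the same error, here is a minimal sketch of how one might check whether an object is archived before reading it with Spark. It is not part of the original thread. It assumes boto3 is installed and AWS credentials are configured, and the helper names `needs_restore` and `check_object` are hypothetical. It treats the GLACIER and DEEP_ARCHIVE storage classes, and any Intelligent-Tiering archive tier, as unreadable until restored; adjust that set to match your own lifecycle rules.

```python
# Storage classes whose objects cannot be read until they are restored.
# Assumption: GLACIER_IR (Instant Retrieval) is left out because it is readable directly.
ARCHIVE_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}


def needs_restore(head: dict) -> bool:
    """Decide from an S3 head_object response whether the object must be restored first."""
    # Intelligent-Tiering objects in an archive tier report an ArchiveStatus field.
    if head.get("ArchiveStatus"):
        return True
    # S3 omits StorageClass for STANDARD objects, so default to STANDARD.
    if head.get("StorageClass", "STANDARD") not in ARCHIVE_CLASSES:
        return False
    # A completed restore is reported as ongoing-request="false" in the Restore field.
    return 'ongoing-request="false"' not in head.get("Restore", "")


def check_object(bucket: str, key: str) -> bool:
    """Fetch the object's metadata from S3 and report whether it needs a restore."""
    import boto3  # imported lazily so needs_restore can be used without boto3

    head = boto3.client("s3").head_object(Bucket=bucket, Key=key)
    return needs_restore(head)
```

Calling `check_object` with the bucket name and key of a failing file should return `True` if the file was archived by a lifecycle rule, which would explain why only the older, unused files fail to load.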

Sivaprasad1
Valued Contributor II

Which DBR version are you using? Could you please test with a different DBR version, for example DBR 9.x?

I tried with all the available LTS versions.
