Data Engineering
DeltaFileNotFoundException: No file found in the directory (sudden task failure)

Juju
New Contributor II

Hi all,

I am currently running a job that upserts a table by reading from the Delta change data feed of my silver table. Here is the relevant snippet of code:

 

 

from datetime import datetime, timedelta

rds_changes = spark.read.format("delta") \
    .option("readChangeFeed", "true") \
    .option("startingVersion", 0) \
    .table("main.default.gold_table") \
    .where(f"_commit_timestamp >= '{(datetime.now() - timedelta(hours=1)).strftime('%Y-%m-%d %H:%M:%S')}'")
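As a side note for readers with the same pattern: a possible alternative (a sketch, assuming the relevant change-feed history is still retained and that `spark` is the usual Databricks notebook session) is to let Delta resolve the starting commit from a timestamp, instead of scanning from version 0 and filtering afterwards:

```python
from datetime import datetime, timedelta

# Sketch: read only the last hour of changes by letting Delta resolve the
# starting commit from a timestamp, rather than reading from version 0.
start_ts = (datetime.now() - timedelta(hours=1)).strftime("%Y-%m-%d %H:%M:%S")

rds_changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingTimestamp", start_ts)  # resolved against the _delta_log
    .table("main.default.gold_table")
)
```

Note that `startingTimestamp` still depends on the corresponding log entries existing, so it does not by itself avoid the error below.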

 

 

Here is the error returned:

 

 

com.databricks.sql.transaction.tahoe.DeltaFileNotFoundException: No file found in the directory: s3://databricks-workspace-stack-70da1-metastore-bucket/60ed403c-0a54-4f42-8b8a-73b8cea1bdc3/tables/6d4a9b3d-f88b-436e-be1b-09852f605f4c/_delta_log.

 

 

I have already done the following:

  • Verified by accessing S3 directly that the _delta_log folder has not been deleted
  • Confirmed I can query the table and run `DESCRIBE HISTORY gold_table` on it without any issue

Does anyone have any idea why this happens? The job was working fine previously without any changes.

1 ACCEPTED SOLUTION

Kaniz
Community Manager

Hi @Juju, the error you're encountering, com.databricks.sql.transaction.tahoe.DeltaFileNotFoundException, indicates that the Delta transaction log is missing from the specified directory.

Let's explore some potential solutions to address this issue:

Check for Delta log truncation or deletion: Delta Lake periodically removes log entries older than the table's log retention period (delta.logRetentionDuration, default 30 days). Reading the change feed with startingVersion 0 can fail if those early log entries have been truncated, even though the table itself is still queryable.

Spark configuration options:

  • Use a new checkpoint directory: create a fresh checkpoint directory for your job. However, you mentioned this might not be feasible because you need to process existing data.
  • Set spark.sql.files.ignoreMissingFiles to true: this property lets Spark ignore missing files during processing. It won't reprocess data from the beginning; it resumes from where the last checkpoint left off.

Adjusting these settings may help you avoid data loss while resolving the issue.

Remember to carefully evaluate the impact of any changes on your existing data and processing flow. If possible, test these solutions in a controlled environment to minimize disruptions. Good luck, and I hope this helps you resolve the issue! 🚀
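The suggestions above could be combined roughly as follows (a sketch, not a tested fix: it assumes a Databricks notebook where `spark` is available, and that DESCRIBE HISTORY still lists the oldest retained version):

```python
# Sketch: tolerate files that have already been removed from storage.
spark.conf.set("spark.sql.files.ignoreMissingFiles", "true")

# Start from the oldest version still present in the Delta log instead of 0.
# DESCRIBE HISTORY only lists versions whose log entries are still retained.
earliest = (
    spark.sql("DESCRIBE HISTORY main.default.gold_table")
    .selectExpr("min(version) AS v")
    .first()["v"]
)

rds_changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", earliest)
    .table("main.default.gold_table")
)
```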


3 REPLIES


Juju
New Contributor II

Hey @Kaniz, I found that the issue was due to a truncated Delta log. Thanks for the help!
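For anyone who lands here with the same root cause: one way to reduce the chance of the change-feed window outliving the log is to extend the table's log retention. A sketch (the 60-day interval is an arbitrary example, and `spark` is the usual notebook session):

```python
# Sketch: keep Delta log entries longer so change-feed reads from older
# versions keep working. The Delta default is 'interval 30 days'.
spark.sql("""
    ALTER TABLE main.default.gold_table
    SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval 60 days')
""")
```

Longer retention means more small files in _delta_log, so weigh this against your storage and listing costs.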

Kaniz
Community Manager

Hi @Juju, we value your perspective! It's great to hear that your query has been resolved. Thank you for your contribution.




 
