DeltaFileNotFoundException: No file found in the directory (sudden task failure)
11-19-2023 06:29 PM
Hi all,
I am currently running a job that upserts a table by reading the Delta change data feed from my silver table. Here is the relevant snippet of code:
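from datetime import datetime, timedelta

# Read the change data feed starting at version 0, then keep only the last hour of commits
rds_changes = spark.read.format("delta") \
    .option("readChangeFeed", "true") \
    .option("startingVersion", 0) \
    .table("main.default.gold_table") \
    .where(f"_commit_timestamp >= '{(datetime.now() - timedelta(hours=1)).strftime('%Y-%m-%d %H:%M:%S')}'")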
Here is the error returned:
com.databricks.sql.transaction.tahoe.DeltaFileNotFoundException: No file found in the directory: s3://databricks-workspace-stack-70da1-metastore-bucket/60ed403c-0a54-4f42-8b8a-73b8cea1bdc3/tables/6d4a9b3d-f88b-436e-be1b-09852f605f4c/_delta_log.
I have done the following:
- Verified that the delta log folder has not been deleted by accessing S3 directly
- Confirmed that I can query the table directly and run `DESCRIBE HISTORY gold_table` on it without any issue
Does anyone have any idea why this happens when I run the job? It was working fine previously, and nothing has changed.
11-20-2023 12:59 AM
Hey @Retired_mod, I found that the issue was caused by a truncated delta log. Thanks for the help, man.
10-22-2024 01:39 PM
What was the fix?
11-14-2024 01:42 AM
1) Find the first version with the change data feed enabled by running
DESCRIBE HISTORY `table_name`;
2) Use that version instead of 0 in .option("startingVersion", x)
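In case it helps, step 1 could be scripted roughly like this. It is only a sketch: `first_cdf_enabled_version` is a made-up helper name, and it assumes that the history rows carry an `operationParameters` map whose `properties` entry is a JSON string containing `delta.enableChangeDataFeed` (which is how a SET TBLPROPERTIES commit usually shows up). Please check the actual DESCRIBE HISTORY output for your table before relying on it.

```python
import json
from typing import Any, Iterable, Mapping, Optional

CDF_PROPERTY = "delta.enableChangeDataFeed"


def first_cdf_enabled_version(
    history_rows: Iterable[Mapping[str, Any]],
) -> Optional[int]:
    """Return the lowest version whose commit set the CDF property to true.

    Hypothetical helper. ASSUMPTION: each row has "version" and an
    "operationParameters" map whose "properties" value is a JSON string.
    """
    for row in sorted(history_rows, key=lambda r: int(r["version"])):
        params = row.get("operationParameters") or {}
        raw_props = params.get("properties")
        if not raw_props:
            continue  # this commit did not touch table properties
        try:
            props = json.loads(raw_props)
        except (TypeError, ValueError):
            continue  # not JSON, so nothing we can inspect
        if isinstance(props, dict) and str(props.get(CDF_PROPERTY, "")).lower() == "true":
            return int(row["version"])
    return None  # no commit in the available history enabled the feed


# With a live Spark session the rows could be collected like this:
# rows = [r.asDict(recursive=True)
#         for r in spark.sql("DESCRIBE HISTORY main.default.gold_table").collect()]
# start = first_cdf_enabled_version(rows)
```

If it returns None, the commit that enabled the feed is probably no longer in the retained history, so you would fall back to the oldest version that DESCRIBE HISTORY still lists.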
02-06-2025 06:19 AM
Could you please elaborate on step 2)?
02-06-2025 07:41 AM
You would run DESCRIBE HISTORY `table_name`; to check which versions are available. If the delta log is truncated for some reason, you will not find a version 0. Use the oldest version you can find instead of 0. For example, if the oldest version you can find in the delta log is 10, use .option("startingVersion", 10).
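Put together, that could look something like the sketch below. The helper names `oldest_available_version` and `read_changes_from_oldest` are made up for illustration, and the code assumes the change data feed was already enabled at the oldest version still listed in the history. If it was enabled later, you would need that later version instead.

```python
from typing import Any, Iterable, Mapping


def oldest_available_version(history_rows: Iterable[Mapping[str, Any]]) -> int:
    """Return the smallest version number found in DESCRIBE HISTORY rows."""
    versions = [int(row["version"]) for row in history_rows]
    if not versions:
        raise ValueError("table history is empty")
    return min(versions)


def read_changes_from_oldest(spark, table_name: str):
    """Read the change feed starting at the oldest version still in the log.

    Hypothetical helper. ASSUMPTION: the change data feed was enabled at or
    before the oldest version that DESCRIBE HISTORY returns.
    """
    history = spark.sql(f"DESCRIBE HISTORY {table_name}").select("version").collect()
    start = oldest_available_version(row.asDict() for row in history)
    return (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        .option("startingVersion", start)  # instead of the hard-coded 0
        .table(table_name)
    )
```

In the original job this would be called as `read_changes_from_oldest(spark, "main.default.gold_table")`, with the `_commit_timestamp` filter applied to the result as before.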