01-03-2023 07:17 AM
Hello to everyone. We filed a support ticket with Databricks. This is the response I received, along with an interim solution to the problem. I hope it is useful to those who read it.
Problem Statement:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 7.0 failed 4 times, most recent failure: Lost task 1.3 in stage 7.0 (TID 24) (10.150.38.137 executor 0): java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: 2022-04-27T20:09:00 (Attached the complete error message)
Root Cause Analysis:
We have an incremental listing mode that speeds up listing by not rescanning prefixes we have already seen, but the incremental listing mode does not handle file names that contain certain special characters, for example:
Also, if you upload a file with a special character in its name into DBFS, the file gets renamed and the character is automatically replaced by _.
A fix is currently on our roadmap, but we don't have an exact ETA.
Solution:
Use the config below to mitigate the issue:
Set cloudFiles.useIncrementalListing to false.
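
For anyone wondering where this option goes, here is a rough sketch of how it might be passed to an Auto Loader stream in PySpark. The helper names (auto_loader_options, build_stream), the json format, and the paths are my own placeholders and are not from the support response; only cloudFiles.useIncrementalListing = false comes from the ticket.

```python
def auto_loader_options(file_format, schema_location):
    """Build Auto Loader options with incremental listing turned off.

    file_format and schema_location are placeholders; use your own values.
    """
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
        # Mitigation from the support ticket: fall back to full directory listing.
        "cloudFiles.useIncrementalListing": "false",
    }


def build_stream(spark, input_path, file_format, schema_location):
    """Return a streaming DataFrame that reads input_path with Auto Loader.

    Expects an existing SparkSession (on Databricks, the predefined `spark`).
    """
    options = auto_loader_options(file_format, schema_location)
    return spark.readStream.format("cloudFiles").options(**options).load(input_path)
```

In a notebook this would be called along the lines of build_stream(spark, "<input path>", "json", "<schema path>"). Note that with incremental listing disabled, each listing scans the full directory, so listing may be slower on large directories.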