Relative path in absolute URI when reading a folder with files containing ":" colons in filename
01-11-2023 09:42 AM
I am trying to read a folder of partitioned files laid out as date/hour/timestamp.csv, where timestamp is the exact timestamp in ISO format, e.g. 09-2022-12-05T20:35:15.2786966Z
It seems Spark has trouble reading files with colons in their names, which is quite ridiculous.
The issue was raised in 2019 (https://issues.apache.org/jira/browse/SPARK-28841), but it still does not seem to be fixed.
Is there a solution other than renaming zillions of files, which in S3 requires a copy of each one?
I am using Spark 3.2.1.
Labels: Relative Path, URI
01-11-2023 02:04 PM
The issue was reopened; see https://issues.apache.org/jira/browse/HDFS-14762
01-17-2023 01:40 AM
I have renamed the files, replacing ":" with "-", as the bug still exists.
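
For anyone who ends up doing the same, here is a rough sketch of how such a rename could be scripted. It is not the exact script used in this thread. The helper names (sanitize_key, plan_renames, rename_in_s3) are made up for illustration, and the S3 part assumes boto3 is installed and credentials are configured. S3 has no rename operation, so each object is copied to the new key and the old key is deleted afterwards.

```python
from typing import Iterable, List, Tuple


def sanitize_key(key: str, replacement: str = "-") -> str:
    """Replace ':' in the final path component of an object key."""
    prefix, sep, name = key.rpartition("/")
    return f"{prefix}{sep}{name.replace(':', replacement)}"


def plan_renames(keys: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (old_key, new_key) pairs for keys whose file name contains ':'."""
    plan = []
    for key in keys:
        new_key = sanitize_key(key)
        if new_key != key:
            plan.append((key, new_key))
    return plan


def rename_in_s3(bucket: str, prefix: str) -> None:
    """Hypothetical helper: copy each affected object to its new key, then delete the old one.

    Assumes boto3 and valid AWS credentials. copy_object handles objects up to
    5 GB; larger objects would need a multipart copy instead.
    """
    import boto3  # imported lazily so the pure helpers above work without it

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys = [obj["Key"] for obj in page.get("Contents", [])]
        for old_key, new_key in plan_renames(keys):
            s3.copy_object(
                Bucket=bucket,
                Key=new_key,
                CopySource={"Bucket": bucket, "Key": old_key},
            )
            s3.delete_object(Bucket=bucket, Key=old_key)
```

Running plan_renames on a listing first and checking the pairs before calling rename_in_s3 is probably a good idea, since the delete step cannot be undone unless bucket versioning is enabled.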

