Getting FileNotFoundException while using cloudFiles
09-05-2023 04:14 AM
Hi,
Following is the code I am using to ingest the data incrementally (weekly):
val ssdf = spark.readStream.schema(schema)
.format("cloudFiles")
.option("cloudFiles.format", "parquet")
.load(sourceUrl)
.filter(criteriaFilter)
val transformedDf = ssdf.transform(.....)
val processData = transformedDf
.select(recordFields: _*)
.writeStream
.option("checkpointLocation", outputUrl + "checkpoint/")
.format("parquet")
.outputMode("append")
.option("path", outputUrl + run_id + "/")
.trigger(Trigger.Once())
.start()
processData.processAllAvailable()
processData.stop()
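For reference, here is a self-contained sketch of the same pipeline. The schema, filter, column list, S3 paths, and run id below are hypothetical stand-ins for the elided values (schema, criteriaFilter, recordFields, sourceUrl, outputUrl, run_id), not the actual job configuration.

// Minimal, self-contained sketch of the weekly Auto Loader ingest above.
// All concrete values here are hypothetical placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.streaming.Trigger
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

val spark = SparkSession.builder().getOrCreate()

// Hypothetical schema and locations.
val schema = StructType(Seq(
  StructField("id", LongType),
  StructField("name", StringType)
))
val sourceUrl = "s3://input-bucket/events/"      // incoming parquet files
val outputUrl = "s3://output-bucket/ingest/"
val runId     = "20230810063959"                 // new folder per weekly run

val ssdf = spark.readStream
  .schema(schema)
  .format("cloudFiles")                          // Databricks Auto Loader source
  .option("cloudFiles.format", "parquet")
  .load(sourceUrl)
  .filter(col("id") > 0)                         // stand-in for criteriaFilter

val query = ssdf
  .select(col("id"), col("name"))                // stand-in for recordFields
  .writeStream
  .option("checkpointLocation", outputUrl + "checkpoint/")   // same checkpoint every run
  .format("parquet")
  .outputMode("append")
  .option("path", outputUrl + runId + "/")                   // per-run output folder
  .trigger(Trigger.Once())
  .start()

query.processAllAvailable()
query.stop()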
Each week the data is written to a new output folder, while the checkpoint goes to the same folder every time.
This worked fine for 3 to 5 incremental runs.
But recently I got the following error:
ERROR: Query termination received for [id=2345245425], with exception: org.apache.spark.SparkException: Job aborted.
Caused by: java.io.FileNotFoundException: Unable to find batch s3://outputPath/20230810063959/_spark_metadata/0
What is the reason for this issue? Any ideas?
Labels:
- CloudFiles
- FileNotFoundException
2 REPLIES
09-05-2023 06:05 AM
Danny, is another process mutating or deleting the incoming files?
09-05-2023 06:08 AM
New files get added to the input location. The input files are not deleted or updated.