Hello, I would like to reopen this issue, since I am facing the same error in our production environment. I have not been able to solve it and would like to ask for help.
Here is the error message I received:
Error while reading file dbfs:/mnt/dynamics/model.json.
Caused by: FileReadException: Error while reading file dbfs:/mnt/dynamics/model.json.
Caused by: InconsistentReadException: The file might have been updated during query execution. Ensure that no pipeline updates existing files during query execution and try again.
Caused by: IOException: Operation failed: "The condition specified using HTTP conditional header(s) is not met.", 412, GET, https://dynamics.dfs.core.windows.net/dataverse/model.json?timeout=90, ConditionNotMet, "The condition specified using HTTP conditional header(s) is not met. RequestId:78fa238b-501f-............ Time:2024-06-06T21:05:33.5168118Z"
Caused by: AbfsRestOperationException: Operation failed: "The condition specified using HTTP conditional header(s) is not met.", 412, GET, https://dynamics.dfs.core.windows.net/dataverse/model.json?timeout=90, ConditionNotMet, "The condition specified using HTTP conditional header(s) is not met. RequestId:78fa238b-............ Time:2024-06-06T21:05:33.5168118Z"
So apparently, while a table is being processed, model.json changes - it is updated in real time by the source, so this is expected.
What I want is to simply ignore that it changed and not throw an error - just take model.json as it was at the time processing started.
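One workaround I was considering (just a sketch, not tested, and the paths below are placeholders for our real mount points) is to copy model.json to a temporary location first and read the copy, so the query never touches the file that the source keeps updating:

// Sketch only: snapshot the changing file before reading it.
val sourcePath = "dbfs:/mnt/dynamics/model.json"
val snapshotPath = s"dbfs:/tmp/model_snapshot_${System.currentTimeMillis()}.json"

// dbutils is available in Databricks notebooks; copy the current version of the file.
dbutils.fs.cp(sourcePath, snapshotPath)

// Read the stable copy instead of the live file.
val model = spark.read.json(snapshotPath)

I am not sure if this is the recommended approach, though.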
I tried to cache the dataframe as was last suggested, but it does not work and still throws the same error.
This is what I added:
import org.apache.spark.storage.StorageLevel
val abcd = spark.read.json("somePath/model.json")
abcd.persist(StorageLevel.DISK_ONLY)
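One thing I am unsure about: as far as I know, persist is lazy and nothing is actually cached until an action runs, so maybe I need to force materialization right after persisting, something like this (just a sketch, the path is a placeholder):

import org.apache.spark.storage.StorageLevel

val abcd = spark.read.json("somePath/model.json")
abcd.persist(StorageLevel.DISK_ONLY)
// count() is an action, so it should force Spark to read the file once and fill the cache
abcd.count()

I do not know whether this would actually avoid the ConditionNotMet error, since the file could still change during that first read.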
Please suggest how to deal with this error.
Thank you.