Yes, no problem.
I have a Python program, called "post ingestion", that runs on a Databricks job cluster during the night and consists of:
- inserting data into a Delta Lake table
- executing an OPTIMIZE command on that table
- executing a VACUUM command on that table
- then using a dbutils command to copy the folder containing this Delta table's data to another folder (I dispatch the data to a lab and a qal Databricks workspace)
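To make the sequence concrete, here is a minimal sketch of those four steps. It is not the real job's code: the function name `run_post_ingestion`, its parameters, and the 168-hour retention value (the Delta default of 7 days) are assumptions for illustration, and `spark` and `dbutils` are passed in explicitly only so the sketch is self-contained.

```python
def run_post_ingestion(spark, dbutils, df, table_path, target_paths,
                       retention_hours=168):
    """Hypothetical sketch of the nightly job; every name here is a placeholder."""
    # 1. Append the new rows to the Delta table stored at table_path.
    df.write.format("delta").mode("append").save(table_path)

    # 2. Compact small data files into larger ones.
    spark.sql(f"OPTIMIZE delta.`{table_path}`")

    # 3. Delete data files that the table no longer references and that are
    #    older than the retention window (assumed value, not the job's setting).
    spark.sql(f"VACUUM delta.`{table_path}` RETAIN {retention_hours} HOURS")

    # 4. Recursively copy the table folder (data files and _delta_log)
    #    to each target folder, e.g. one for lab and one for qal.
    for target in target_paths:
        dbutils.fs.cp(table_path, target, True)
```

In a Databricks notebook or job, `spark` and `dbutils` are normally available as globals, so the call would look something like `run_post_ingestion(spark, dbutils, new_rows_df, source_path, [lab_path, qal_path])`, where all of those variable names are again placeholders.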
Sometimes the copy fails with a 404 error:
java.io.FileNotFoundException: Operation failed: 'The specified path does not exist.', 404, GET, https://icmfcprddls001.dfs.core.windows.net/prd/curated/common/AS400/AS400.BC300.FT3RCPV/_delta_log/..., PathNotFound, 'The specified path does not exist. RequestId:e31cc75e-b01f-0042-1858-c23924000000 Time:2022-09-07T01:26:06.8939739Z
When the error occurs, no other program is using the Delta table.
In the morning, when I re-run the copy, everything runs fine.
They asked me to add this parameter to get more detail about Spark operations. But I want to know exactly what this parameter does, and MS was not able to give me more information.