I have a workflow with 11 tasks (each task executes one notebook) that run in sequence. The workflow was run on 9/1 and again today (9/10). I am working on reporting task history and status using the system table `system.lakeflow.job_task_run_timeline`.
The state of the notebooks is such that all tasks succeed except for #7, which fails on each of its three attempts.
When I query the system table for records where job_run_id matches the value of the run from 9/1, I get 13 records (10 from the tasks that succeed, 3 for the attempts at task #7). All is well.
When I query the system table for records where job_run_id matches the value of the run from 9/10, I get only 7 records: successes on tasks 1-6, and only the first attempt at task #7. Attempts 2 and 3 of task #7 are missing from the table, as are tasks #8-11. I emphasize, however, that in the Workflows screens, it's utterly plain to see that all 13 task attempts (10 successful, 3 failed) were executed. I can easily click into the fully executed notebook instance for task #9, for example.
This is not good, I think. Where are the records for the remaining task runs from 9/10? What was it about attempt 1 of task #7 that short-circuited the recording of the task runs that followed it?
Footnote: the only difference between the conditions for 9/1 and those for 9/10 is that the 9/1 run was started by a trigger, while the 9/10 run was started manually by clicking "Run Now".
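
To make the comparison concrete, here is a minimal sketch of the check described above. It is not the exact code in use. The column names in the query (`job_run_id`, `task_key`, `run_id`, `result_state`, `period_start_time`) are assumed from the documented schema of the system table, the task keys `task_01` through `task_11` are hypothetical placeholders, and the rows are stand-ins for what the query returned for the 9/10 run. It also assumes each attempt shows up as a single row.

```python
from collections import Counter

# Query text only (not executed here). Column names are assumed from the
# documented schema of system.lakeflow.job_task_run_timeline.
QUERY = """
SELECT task_key, run_id, result_state, period_start_time
FROM system.lakeflow.job_task_run_timeline
WHERE job_run_id = :job_run_id
ORDER BY period_start_time
"""


def missing_attempts(rows, expected):
    """Return {task_key: shortfall} for tasks with fewer rows than expected."""
    seen = Counter(row["task_key"] for row in rows)
    return {key: n - seen[key] for key, n in expected.items() if seen[key] < n}


# Hypothetical task keys: one expected row per task, three for the failing task #7.
expected = {f"task_{i:02d}": 1 for i in range(1, 12)}
expected["task_07"] = 3

# Stand-in for the 7 rows returned for the 9/10 run:
# tasks 1-6 plus only the first attempt at task #7.
rows_910 = [{"task_key": f"task_{i:02d}"} for i in range(1, 8)]

print(missing_attempts(rows_910, expected))
```

Run against the 9/1 rows, the same function should return an empty dict. For 9/10 it reports two missing attempts for task #7 and one missing row each for tasks #8-11, which are the six records I cannot find.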