Malthe

I understand the difference between wall-clock time and time aggregated across cores; it's documented under query profiling.

According to the pricing docs, presently a DBU is roughly equivalent to a DS3 v2 instance with 4 cores. What I mean by unaccounted for is that we have a query that runs for a total aggregated time of 1.94h with a wall-clock runtime of 4-5 minutes. The math checks out because that's roughly 24 core hours and matches the expectations of 8 DBUs with the 4-core instance mentioned above.
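As a rough sanity check on those figures, the average parallelism implied by an aggregated time and a wall-clock time can be sketched as below. This is only a back-of-the-envelope calculation: it assumes every core is busy for the whole wall-clock duration, it takes the 4-cores-per-DBU equivalence from the paragraph above at face value, and the function name `implied_cores` is made up for illustration.

```python
CORES_PER_DBU = 4  # assumption: the 4-core DS3 v2 equivalence quoted above


def implied_cores(aggregated_hours: float, wall_clock_minutes: float) -> float:
    """Average number of busy cores implied by the two timings."""
    aggregated_minutes = aggregated_hours * 60
    return aggregated_minutes / wall_clock_minutes


# Aggregated time of 1.94h with a wall-clock runtime of 4-5 minutes.
for minutes in (4, 5):
    cores = implied_cores(1.94, minutes)
    print(f"{minutes} min wall-clock: ~{cores:.1f} cores, "
          f"~{cores / CORES_PER_DBU:.1f} four-core instances")
```

For the 4-5 minute range this works out to somewhere between roughly 23 and 29 busy cores on average.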

The problem is that the execution breakdown accounts for just a fraction of that, yet it purports to be the complete picture. It's basically broken for this type of query.

I have now changed the processing to first materialize to a staging table and then do the merge. This reduces the runtime to 40 minutes, but the query details still bear little relation to the actual aggregated runtime:

Screenshot 2026-03-07 at 08.10.31.png

Observations:

  1. The total time across executions (in the dropdown) matches neither the total aggregated query time nor the wall-clock time.
  2. Within each execution, the purported aggregated time spent in the various boxes does not match the reported total execution time. In the screenshot, we see a single task with an aggregated time of 38.34s, but the whole execution is reported to take 35.23s.
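The kind of consistency check behind the second observation can be sketched as follows. This is a minimal illustration, not anything the query profile itself exposes: the helper `unaccounted_seconds` is hypothetical, and the only inputs are the two numbers read off the screenshot. It assumes the reported execution time is meant to be comparable to the per-task aggregated times.

```python
def unaccounted_seconds(execution_total: float, task_times: list[float]) -> float:
    """Reported execution time minus the sum of its per-task times.

    A positive result is time the breakdown does not explain; a negative
    result means the tasks claim more time than the execution as a whole.
    """
    return execution_total - sum(task_times)


# Values from the screenshot: one task at 38.34s, execution reported as 35.23s.
gap = unaccounted_seconds(35.23, [38.34])
print(f"gap: {gap:.2f}s")
```

A negative gap, as here, means a single constituent already exceeds the total it is supposed to be part of.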

Many task types contribute no metrics at all:

Screenshot 2026-03-07 at 08.14.55.png

In summary, there's basically no transparency here. The total cost of a query simply can't be traced back to its constituents in any meaningful way. That's a significant setback from classic compute where the Spark UI along with basic compute metrics did in fact account for the time spent and the resulting cost.