SteveOstrowski
Databricks Employee

Hi @ADBricksExplore,

The sub-statements panel you see in the Query History UI (the child queries spawned by a CALL or multi-statement execution) is rendered from an internal GraphQL endpoint that is not part of the public Databricks API surface. That is why mimicking those browser requests returns inconsistent 400 errors. Those endpoints are undocumented, unsupported for automation, and subject to change without notice, so building on top of them is not recommended.

That said, there are supported approaches you can use to correlate child statement executions with their parent job context.

APPROACH 1: CORRELATE VIA job_task_run_id IN system.query.history

The system.query.history system table (Public Preview) contains a query_source struct with nested job_info fields: job_id, job_run_id, and job_task_run_id. When a CALL statement or stored procedure runs inside a job task, all the individual SQL statements it spawns share the same job_task_run_id. You can use this to group them:

SELECT
  statement_id,
  statement_type,
  statement_text,
  start_time,
  end_time,
  total_duration_ms,
  execution_duration_ms,
  read_rows,
  produced_rows,
  read_bytes,
  written_bytes,
  execution_status
FROM system.query.history
WHERE query_source.job_info.job_task_run_id = '<your_task_run_id>'
ORDER BY start_time

This returns every statement that executed within that specific task run, effectively giving you the parent CALL plus all child statements it triggered. You can further distinguish the parent from the children by filtering on statement_type (the CALL will have statement_type = 'CALL' or similar, while children will be SELECT, INSERT, MERGE, etc.).
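Once you have those rows back, splitting the parent from its children is a simple partition on statement_type. A minimal Python sketch of that step, assuming you have already fetched the rows (for example via the Databricks SQL connector) into dicts keyed by the columns selected above:

```python
def split_parent_children(rows):
    """Return (parent_rows, child_rows) for the statements of one task run."""
    parents = [r for r in rows if r["statement_type"] == "CALL"]
    children = [r for r in rows if r["statement_type"] != "CALL"]
    # Keep children in execution order, matching the ORDER BY in the query.
    children.sort(key=lambda r: r["start_time"])
    return parents, children

# Illustrative rows shaped like the system.query.history columns above.
rows = [
    {"statement_type": "SELECT", "start_time": "2024-05-01T10:00:05"},
    {"statement_type": "CALL",   "start_time": "2024-05-01T10:00:00"},
    {"statement_type": "MERGE",  "start_time": "2024-05-01T10:00:09"},
]
parents, children = split_parent_children(rows)
# parents holds the CALL row; children holds SELECT then MERGE
```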

APPROACH 2: CORRELATE VIA session_id + TIME WINDOW

If your workload does not run inside a job (for example, interactive notebook or SQL editor execution), you can fall back to correlating by session_id and a time window:

SELECT
  statement_id,
  statement_type,
  statement_text,
  start_time,
  total_duration_ms,
  read_rows,
  produced_rows,
  execution_status
FROM system.query.history
WHERE session_id = '<your_session_id>'
  AND start_time BETWEEN '<parent_start>' AND '<parent_end>'
ORDER BY start_time

The session_id ties together all statements executed in the same session, and the time window scopes the result to the execution window of the parent statement.
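The same filter can of course be applied client-side after a broader fetch. A sketch of that equivalent step in Python, operating on rows already pulled from system.query.history (dict keys mirror the columns selected above):

```python
def statements_in_window(rows, session_id, parent_start, parent_end):
    """Statements from one session that started inside the parent's window."""
    return sorted(
        (r for r in rows
         if r["session_id"] == session_id
         and parent_start <= r["start_time"] <= parent_end),
        key=lambda r: r["start_time"],
    )

# ISO-8601 timestamps in one format compare correctly as strings.
rows = [
    {"session_id": "s1", "start_time": "2024-05-01T10:00:02", "statement_type": "SELECT"},
    {"session_id": "s2", "start_time": "2024-05-01T10:00:03", "statement_type": "INSERT"},
    {"session_id": "s1", "start_time": "2024-05-01T11:00:00", "statement_type": "MERGE"},
]
hits = statements_in_window(rows, "s1", "2024-05-01T10:00:00", "2024-05-01T10:05:00")
# hits contains only the SELECT from session s1
```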

APPROACH 3: REST API WITH FILTERS

The Query History REST API (GET /api/2.0/sql/history/queries) also supports filtering by time range and returns metrics like duration, rows read/produced, bytes read/written, and execution status. You can also use the Databricks CLI:

databricks query-history list --include-metrics --filter-by '{"statuses":["FINISHED"]}'

However, note that this endpoint does not expose a parent_statement_id or explicit parent-child linking either. You would still need to correlate by time window or by matching the warehouse_id and session context.
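Because the endpoint exposes no parent link, that correlation has to happen client-side after you fetch the results. A minimal sketch of the filtering step, assuming you already hold the endpoint's result list (the warehouse_id and query_start_time_ms field names follow the documented response schema, but verify them against the API reference):

```python
def correlate_by_window(api_results, warehouse_id, parent_start_ms, parent_end_ms):
    """Filter Query History API results to one warehouse and the parent
    statement's execution window (epoch milliseconds)."""
    return [
        q for q in api_results
        if q.get("warehouse_id") == warehouse_id
        and parent_start_ms <= q.get("query_start_time_ms", 0) <= parent_end_ms
    ]

# Illustrative entries shaped like the endpoint's response.
api_results = [
    {"query_id": "a", "warehouse_id": "wh1", "query_start_time_ms": 1_000},
    {"query_id": "b", "warehouse_id": "wh1", "query_start_time_ms": 9_000},
    {"query_id": "c", "warehouse_id": "wh2", "query_start_time_ms": 1_500},
]
children = correlate_by_window(api_results, "wh1", 500, 2_000)
# only query "a" falls inside the window on warehouse wh1
```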

APPROACH 4: CUSTOM TAGGING WITH query_tags

If you control the code that runs inside your CALL statements, you can add custom query tags to each child query. This makes correlation straightforward regardless of execution context:

SET query_tags = map('parent_call', 'my_procedure_v1', 'run_id', '<unique_run_id>');
-- your child queries here

Then query:

SELECT *
FROM system.query.history
WHERE query_tags['parent_call'] = 'my_procedure_v1'
AND query_tags['run_id'] = '<unique_run_id>'
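To make the run_id genuinely unique per execution, you can generate the tagging statement programmatically before submitting the child queries. A sketch (the helper name is mine, not a Databricks API):

```python
import uuid

def build_tagging_statement(parent_call):
    """Return the SET query_tags statement plus the generated run_id.

    Execute the statement in the same session before the child queries so
    they all carry the tags.
    """
    run_id = uuid.uuid4().hex
    stmt = (
        f"SET query_tags = map('parent_call', '{parent_call}', "
        f"'run_id', '{run_id}')"
    )
    return stmt, run_id

stmt, run_id = build_tagging_statement("my_procedure_v1")
# run_id is a 32-character hex string embedded in the SET statement
```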

WHAT IS NOT AVAILABLE TODAY

- There is no parent_statement_id column in system.query.history that directly links a child statement to its parent CALL statement.
- The operator-level DAG (the query profile visualization) is not available via any public API. The GraphQL endpoints that power the UI are internal only.
- SparkUI logs are not accessible on serverless compute, as you noted.

For ongoing monitoring, the recommended pattern is to build an incremental ingestion pipeline that queries system.query.history using the update_time column as a watermark. This gives you a durable, queryable history of all statement executions with their job context and metrics.
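A minimal sketch of that watermark pattern (the helper names are illustrative; persisting the watermark between runs, e.g. in a Delta table, is assumed to happen elsewhere):

```python
def build_incremental_query(last_watermark):
    """SQL for the next incremental pull, using update_time as the watermark."""
    return (
        "SELECT * FROM system.query.history "
        f"WHERE update_time > '{last_watermark}' "
        "ORDER BY update_time"
    )

def advance_watermark(rows, last_watermark):
    """New watermark = max update_time seen this batch, else keep the old one."""
    return max((r["update_time"] for r in rows), default=last_watermark)

# One cycle: build the query, (fetch rows elsewhere), advance the watermark.
query = build_incremental_query("2024-05-01T09:00:00")
rows = [
    {"statement_id": "x", "update_time": "2024-05-01T10:00:00"},
    {"statement_id": "y", "update_time": "2024-05-01T10:07:00"},
]
next_wm = advance_watermark(rows, "2024-05-01T09:00:00")
# next_wm == "2024-05-01T10:07:00"
```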

REFERENCES

- System tables: Query history: https://docs.databricks.com/aws/en/admin/system-tables/query-history
- Monitor query history with the UI: https://docs.databricks.com/aws/en/sql/admin/query-history
- Query History REST API: https://docs.databricks.com/api/workspace/queryhistory
- Query Profile: https://docs.databricks.com/aws/en/sql/admin/query-profile

* This reply was drafted with an agent system I built, which researches responses against the wide set of documentation I have available and previous memory. I personally review each draft for obvious issues, monitor the system's reliability, and update it when I detect drift, but there is still a small chance something is inaccurate, especially if you are experimenting with brand-new features.

If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.