Hi,
I cannot find so far a way to get programmatically (SQL/Python) the Subqueries(/Sub-statements) executions history records, shown in ADBricks UI Query History/Profile, that were executed during a TaskRun of Job, as shown in [red boxes] on the attached screenshot:
Subqueries-ADBricks-UI
What I've tried so far, without any luck:
- tried mimicking the browser's client app ADBricks UI requests, like [POST https://*.azuredatabricks.net/graphql/HistoryStatementPlanMetadata] with {operationName:"HistoryStatementPlanMetadata",variables:{metadataInput:{queryId="GUIDGUID-GUID-GUID-GUID-...bla-bla...}}query="query HistoryStatementPlanMetadata($metadataInput:...bla-bla...}
-- this never works reliably, returning 400 "Graph request not authentic" replies, regardless that a mimicked request (is identical to the client ADBricks app request) and is correctly authenticated & authorized within the same browser's session and the same browser's request's headers [Cookie:workspace-url=adb-***.azuredatabricks.net;DBAUTH=dbws.1.***]
- tried exploring [system.access.*/(audit,table_lineage)], [system.query.*] tables with a Notebook queries [SELECT ... JOIN ...]
What I read about, by it looks not tightly related to historical traces:
- one may try use "injected" Python code (spark.QueryExecutionListener + spark.DataFrame + observe()) as a wrapper/neighbor statement at runtime to a [CALL] statement and immediately after the [CALL ...] invocation - at the runtime, inspect (via DataFrame.explain() in "injected" code) an object model of returned [Plan*]s against expected metrics information;
-- this, additionally needs some "automatic code-injection/-binding mechanism" into each SQL-statement of code (generated by [CREATE PROCEDURE...]) of a Function residing in a UnityCatalog, which is injecting/binding right before been invoked by a Job TaskRun.
- one may also try, to dig into SparkUI logs, but for [Serverless] computes - SparkUI announced as not available.
So, in addition to the already known way, for retrieving an originating parent query ([CALL...] statement) execution history/metrics programmatically from "Query History" (via [system.*] tables, or, via API):
- could someone, please advise a way, how to get programmatically the child Subqueries/Sub-statements executions history/metrics records - in essential/native way using ADBricks released/public tools, or some alternatives?
A way, that can be "put into code" for periodical invocations in a Job TaskRun, for incrementally gathering Subqueries/sub-statements metrics records copies => into an organization's own, let's say [SQL/ETL-Log/History] flat table?
Thanks in advance