Can we get the actual query execution plan programmatically (rather than through the UI) after a query is executed?
03-22-2024 03:24 AM
Let's say I have run a query and it has returned results. I can find the corresponding query execution plan in the UI. Is there any way to get that execution plan programmatically or through an API?
03-23-2024 07:50 AM
You can obtain the query execution plan programmatically using the EXPLAIN statement in SQL. The EXPLAIN statement displays the execution plan that the planner generates for the supplied statement, without running it. In Spark, the plan shows how the tables referenced by the statement will be scanned and filtered, where data is shuffled, and, if multiple tables are referenced, which join strategy (for example, broadcast hash join or sort-merge join) will be used to bring together the required rows from each input.
Here is an example of how you can use it:
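# Spark SQL via PySpark; `spark` is the active SparkSession (predefined in Databricks notebooks)
query = "SELECT * FROM table"
# EXPLAIN returns the plan as a DataFrame instead of running the query
plan = spark.sql(f"EXPLAIN {query}")
# truncate=False prints the full plan text rather than cutting it off
plan.show(truncate=False)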
This will return a DataFrame with a single row and column that contains the execution plan as a string.
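Since show() only prints the plan, here is a minimal sketch of pulling that single cell out as a plain Python string so it can be logged or stored. The helper name `explain_to_string` is hypothetical, and it assumes an existing SparkSession is passed in as `spark`.

```python
def explain_to_string(spark, query):
    """Return the EXPLAIN output for `query` as a plain Python string.

    `spark` is assumed to be an active SparkSession; this helper name is
    illustrative and not part of any Spark API.
    """
    # EXPLAIN yields a one-row, one-column DataFrame; first() fetches that row.
    row = spark.sql(f"EXPLAIN {query}").first()
    # The only column of the row holds the plan text.
    return row[0]
```

For example, `plan_text = explain_to_string(spark, "SELECT * FROM table")` could then be written to a log or compared between runs.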
Note that the EXPLAIN command only provides the planned logical and physical plans (plain EXPLAIN shows just the physical plan). It does not provide runtime details such as how much time each stage took or how much data was read. For that level of detail, you would need to parse the Spark UI or logs.
03-23-2024 07:29 PM
The EXPLAIN docs show some extra functionality: https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-qry-explain.html
EXPLAIN [ EXTENDED | CODEGEN | COST | FORMATTED ] statement
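As a rough sketch of using those modes from PySpark, the helpers below build the EXPLAIN statement for an optional mode and return the plan text. The names `build_explain` and `explain_plan` are hypothetical, and `spark` is assumed to be an existing SparkSession; only the four mode keywords come from the syntax line above.

```python
# Mode keywords taken from the EXPLAIN syntax shown above.
EXPLAIN_MODES = {"EXTENDED", "CODEGEN", "COST", "FORMATTED"}


def build_explain(query, mode=None):
    """Build an EXPLAIN statement for `query`, optionally with a mode keyword."""
    if mode is None:
        return f"EXPLAIN {query}"
    mode = mode.upper()
    if mode not in EXPLAIN_MODES:
        raise ValueError(f"Unsupported EXPLAIN mode: {mode}")
    return f"EXPLAIN {mode} {query}"


def explain_plan(spark, query, mode=None):
    """Run EXPLAIN through an assumed SparkSession and return the plan text."""
    # The result is a one-row, one-column DataFrame holding the plan string.
    return spark.sql(build_explain(query, mode)).first()[0]
```

A call such as `explain_plan(spark, "SELECT * FROM table", "FORMATTED")` would then return the formatted plan as a string. These are still planned, not runtime, details.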