You can obtain the query execution plan programmatically using the EXPLAIN
statement in SQL. The EXPLAIN
statement displays the execution plan that the database planner generates for the supplied statement. The execution plan shows how the table(s) referenced by the statement will be scanned (by plain sequential scan, index scan, etc.) and, if multiple tables are referenced, what join algorithms will be used to bring together the required rows from each input table.
Here is an example of how you can use it:
query = "SELECT * FROM table"
plan = spark.sql(f"EXPLAIN {query}")
plan.show(truncate=False)
This will return a DataFrame with a single row and column that contains the execution plan as a string.
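If you want the plan as a plain Python string rather than a printed table, you can collect that single cell. A minimal sketch, continuing from the plan DataFrame above:

plan_text = plan.collect()[0][0]  # the one cell in the result holds the full plan text
print(plan_text)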
Note that the EXPLAIN command only provides the logical and physical plans. It does not provide runtime details such as how much time each stage took or how much data was read. For that level of detail, you would need to consult the Spark UI or parse the logs.
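If you are working with a DataFrame rather than a raw SQL string, the same plans are available through the DataFrame.explain() method. A brief sketch, assuming a table named table is registered in the catalog:

df = spark.sql("SELECT * FROM table")
df.explain()               # physical plan only
df.explain(extended=True)  # parsed, analyzed and optimized logical plans plus the physical plan

Either way, these are static plans produced before execution; per-stage timings and data volumes still come from the Spark UI.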