
How to obtain a query profile programmatically?

guizsantos
New Contributor II

Hi everyone! Does anyone know if there is a way to obtain the data used to create the graph shown in the "Query profile" section? In particular, I am interested in the rows produced by the intermediate query operations. I can see there is a "Download" button, which downloads a descriptive JSON file of the graph data. That file is exactly what I want to obtain.

However, performing this step manually is simply not feasible. I've searched through the Databricks REST API docs, but it seems the query profile is not provided by any endpoint.

1 ACCEPTED SOLUTION

Accepted Solutions

raphaelblg
Honored Contributor

Hello @guizsantos ,

The DBSQL query profile is generated from the Spark plans and execution logs. The Spark plans can be gathered through the `EXPLAIN` SQL command. However, it's important to note that the full query profile as seen in the UI is not currently retrievable through the Databricks REST API.
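As a rough illustration of the `EXPLAIN` route, here is a minimal Python sketch. The helper names `build_explain` and `fetch_plan` and the table `my_table` are hypothetical, and the executor is passed in by the caller, so nothing here depends on a particular client library. Note that `EXPLAIN` returns the plan only; it does not include the per-operator row counts shown in the UI profile.

```python
from typing import Callable

# Modes accepted by Spark SQL's EXPLAIN statement.
EXPLAIN_MODES = {"EXTENDED", "CODEGEN", "COST", "FORMATTED"}


def build_explain(query: str, mode: str = "FORMATTED") -> str:
    """Wrap a SQL query in an EXPLAIN statement (hypothetical helper)."""
    mode = mode.upper()
    if mode not in EXPLAIN_MODES:
        raise ValueError(f"unsupported EXPLAIN mode: {mode}")
    # Drop a trailing semicolon so the statement stays a single command.
    return f"EXPLAIN {mode} {query.strip().rstrip(';')}"


def fetch_plan(run_sql: Callable[[str], str], query: str,
               mode: str = "FORMATTED") -> str:
    """Run EXPLAIN through a caller-supplied executor and return the plan text."""
    return run_sql(build_explain(query, mode))


# Assumption: in a Databricks notebook, where `spark` is predefined,
# the executor could look like this (EXPLAIN returns one row of plan text):
#   run_sql = lambda stmt: spark.sql(stmt).collect()[0][0]
#   print(fetch_plan(run_sql, "SELECT * FROM my_table WHERE id > 10"))
```

Keeping the executor as a parameter means the same helper could be pointed at a notebook session or any other SQL client you already use.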

If you're interested in having this feature available in Databricks, I encourage you to share your idea in the Databricks Ideas Portal.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks


2 REPLIES

guizsantos
New Contributor II

Hey @raphaelblg , thanks for your input!

I understand that some information can be obtained with the `EXPLAIN` command. However, its output is hard to interpret and definitely does not provide what interests us most: the rows processed and generated by the query's intermediate operations. I also tried going through the Spark execution logs, but they are very scattered, and I could not find a way to gather them programmatically either.

So, I will submit the request in the ideas portal, thanks for the reference.
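In the meantime, for profiles that have already been downloaded by hand with the "Download" button, something like the sketch below could pull out the row counts. The layout of that JSON file is not documented in this thread, so the code assumes nothing about it beyond being valid JSON: it walks the whole document and reports every numeric field whose key contains "rows". The function names and the file name are hypothetical.

```python
import json
from typing import Any, Iterator, List, Tuple


def iter_row_metrics(node: Any, path: str = "") -> Iterator[Tuple[str, float]]:
    """Yield (path, value) for every numeric field whose key mentions 'rows'."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            # bool is a subclass of int in Python, so exclude it explicitly.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if is_number and "rows" in key.lower():
                yield child, value
            else:
                yield from iter_row_metrics(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from iter_row_metrics(item, f"{path}[{index}]")


def load_row_metrics(profile_path: str) -> List[Tuple[str, float]]:
    """Read a downloaded profile file and collect its row-related metrics."""
    with open(profile_path, encoding="utf-8") as handle:
        return list(iter_row_metrics(json.load(handle)))


# Hypothetical usage with a manually downloaded file:
#   for path, value in load_row_metrics("query_profile.json"):
#       print(path, value)
```

If the file stores counts as strings or under differently named keys, the key filter and the type check would need adjusting after inspecting a real download.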
