Data Engineering

How to obtain a query profile programmatically?

guizsantos
New Contributor II

Hi everyone! Does anyone know if there is a way to obtain the data used to create the graph shown in the "Query profile" section? In particular, I am interested in the rows produced by the intermediate query operations. I can see there is a "Download" button that downloads a descriptive JSON file of the graph data, which is exactly what I want to obtain.

However, performing this step manually for every query is simply not feasible. I've searched through the Databricks REST API docs, but it seems the query profile is not exposed by any endpoint.
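For a profile JSON that has already been downloaded by hand, a structure-agnostic walk is one way to surface row-count fields. This is only a sketch: the layout of the file is not described in this thread, so matching keys on the substring "rows" is an assumption, and `find_row_metrics` and `load_row_metrics` are hypothetical helper names.

```python
import json
from typing import Any, Iterator, Tuple


def find_row_metrics(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Recursively yield (path, value) for numeric fields whose key contains 'rows'.

    Assumption: row counts live under keys that mention "rows". The real
    key names in the downloaded file may differ.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if "rows" in key.lower() and is_number:
                yield child_path, value
            else:
                # Keep descending into nested objects and arrays.
                yield from find_row_metrics(value, child_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_row_metrics(item, f"{path}[{index}]")


def load_row_metrics(profile_path: str) -> list:
    """Load a manually downloaded profile file and collect matching fields."""
    with open(profile_path, encoding="utf-8") as handle:
        return list(find_row_metrics(json.load(handle)))
```

Printing the returned paths first is a quick way to check which keys the file actually uses before relying on the substring match.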

Accepted Solutions

raphaelblg
Databricks Employee

Hello @guizsantos ,

The DBSQL query profile is generated from the Spark plans and execution logs. The Spark plans can be gathered through the EXPLAIN SQL command. However, the full query profile as seen in the UI is not currently retrievable through the Databricks REST API.
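As a rough illustration of the EXPLAIN route, the plan text can be collected from Python. This is a minimal sketch, assuming a `SparkSession` is available as `spark` (as it usually is in a notebook); `build_explain_statement` and `explain_plan` are hypothetical helper names. Note that EXPLAIN returns the plan only, not the runtime row counts shown in the UI profile.

```python
from typing import Any

# Modes accepted by Spark SQL's EXPLAIN statement ("" selects the default output).
EXPLAIN_MODES = {"", "EXTENDED", "CODEGEN", "COST", "FORMATTED"}


def build_explain_statement(query: str, mode: str = "FORMATTED") -> str:
    """Wrap a SQL query in an EXPLAIN statement with the given mode."""
    mode = mode.strip().upper()
    if mode not in EXPLAIN_MODES:
        raise ValueError(f"Unsupported EXPLAIN mode: {mode!r}")
    # Drop surrounding whitespace and a trailing semicolon before wrapping.
    body = query.strip().rstrip(";").strip()
    return " ".join(part for part in ("EXPLAIN", mode, body) if part)


def explain_plan(spark: Any, query: str, mode: str = "FORMATTED") -> str:
    """Run EXPLAIN through a SparkSession and return the plan as text."""
    rows = spark.sql(build_explain_statement(query, mode)).collect()
    # EXPLAIN yields one row with one string column that holds the plan.
    return rows[0][0]
```

In a notebook this could be called as `print(explain_plan(spark, "SELECT 1"))`.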

If you're interested in having this feature available in Databricks, I encourage you to share your idea in the Databricks Ideas Portal.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

guizsantos
New Contributor II

Hey @raphaelblg, thanks for your input!

I understand that some info may be obtained with the `EXPLAIN` command. However, the meaning of its output is not very clear, and it definitely does not provide what is most interesting to us: the rows processed/generated by the query's intermediate operations. I tried going through the Spark execution logs as well, but they are very scattered and I was not able to find a way to gather them programmatically either.

So, I will submit the request in the Ideas Portal. Thanks for the reference.
