Databricks: Report on SQL queries that are being executed

enichante
New Contributor

We have a SQL workspace with a cluster running that services a number of self-service reports against a range of datasets. We want to analyse and report on the queries our self-service users are executing, so we can get better visibility of who is using the data platform and what tables are being used, and how. Ideally we would do this reporting in the Databricks SQL workspace itself rather than in another tool.

All this information is available in the UI in the Query history, but not in a form we can easily analyse or create graphs against.

We know there is an API that returns the query history shown in the UI, but it seems convoluted to call the API for data about our own cluster, only to ingest it back into that cluster so we can query it.

What is the best way to get query history information into a Hive table so we can query, analyse and graph it?

1 ACCEPTED SOLUTION

BilalAslamDbrx
Honored Contributor III

@Werner Stinckens is right, the API is the way to go -- for now! We want to make this a better experience for you, e.g. by giving you a system table you can query directly without having to extract the data with an API and re-ingest it.
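For anyone wanting a starting point, here is a minimal sketch of the extraction step, not an official client. It assumes the Query History REST endpoint is `GET /api/2.0/sql/history/queries`, that it accepts `max_results` and `page_token`, and that each response carries `res`, `has_next_page` and `next_page_token`; please check those names against the current API docs before relying on them. The helper names (`http_get_json`, `iter_query_history`) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

# Assumed endpoint path for the query history API; verify against the docs.
HISTORY_PATH = "/api/2.0/sql/history/queries"


def http_get_json(host: str, token: str, params: Dict[str, str]) -> dict:
    """Issue one authenticated GET against the (assumed) history endpoint."""
    query = urllib.parse.urlencode(params)
    url = f"{host.rstrip('/')}{HISTORY_PATH}?{query}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_query_history(
    fetch_page: Callable[[Dict[str, str]], dict],
    max_results: int = 100,
) -> Iterator[dict]:
    """Yield query records page by page until the API reports no more pages.

    `fetch_page` takes the request parameters and returns the decoded JSON
    body, which keeps the paging logic independent of the HTTP layer.
    """
    params = {"max_results": str(max_results)}
    while True:
        page = fetch_page(params)
        # 'res' is the assumed name of the list of query records.
        yield from page.get("res", [])
        next_token = page.get("next_page_token")
        if not page.get("has_next_page") or not next_token:
            break
        # Assumption: the token can be combined with max_results.
        params = {"page_token": next_token, "max_results": str(max_results)}
```

To run it against a real workspace you could bind the host and a personal access token with `functools.partial(http_get_json, "https://<workspace-host>", token)` and pass the result as `fetch_page`.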


4 REPLIES

-werners-
Esteemed Contributor III

The API is the way to go.

Chris_Grabiel
New Contributor III

Agree with @Werner Stinckens. We built a lake pipeline to feed that data via the API into lake storage, so we could keep more query history and combine that history across workspaces.
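As a rough illustration of the landing step in a pipeline like that (not the actual pipeline described above), the sketch below flattens API records into one JSON line per query and tags each row with the workspace it came from, so history from several workspaces can sit in one table. The record field names (`query_id`, `user_name`, `status`, `query_text`, `duration`, `query_start_time_ms`) are assumptions about the API response, and `to_row` / `write_jsonl` are hypothetical helpers.

```python
import json
from datetime import datetime, timezone
from typing import Iterable

# Assumed field names in each query history record; adjust to the real payload.
FIELDS = ("query_id", "user_name", "status", "query_text", "duration")


def to_row(record: dict, workspace: str) -> dict:
    """Pick the columns of interest and add a workspace tag."""
    row = {name: record.get(name) for name in FIELDS}
    start_ms = record.get("query_start_time_ms")
    # Convert epoch milliseconds to an ISO 8601 UTC timestamp when present.
    row["started_at"] = (
        datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc).isoformat()
        if start_ms is not None
        else None
    )
    row["workspace"] = workspace
    return row


def write_jsonl(records: Iterable[dict], workspace: str, path: str) -> int:
    """Write one JSON object per line and return the number of rows written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(to_row(record, workspace)) + "\n")
            count += 1
    return count
```

Files in this shape can then be read with `spark.read.json(...)` and saved as a table for reporting; how often to run the extract and how to deduplicate on `query_id` are left open here.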


Anonymous
Not applicable

Looks like the people have spoken: the API is your best option! (Thanks @Werner Stinckens, @Chris Grabiel and @Bilal Aslam!)

@eni chante, let us know if you have questions about the API. If not, please mark one of the replies above as the "best answer" so we know the case is closed.

We would also love to know what creative solutions you came up with via our API. Feel free to reply below and share the knowledge! Talk soon.
