
How to get usage statistics from Databricks or SQL Databricks?

Mado
Valued Contributor II

Hi,

I am looking for a way to get usage statistics from Databricks (Data Science & Engineering and SQL persona).

For example:

  • I created a table. I want to know how many times a specific user queried that table.
  • How many times a pipeline was triggered?
  • How long it took to run a DLT pipeline?

Is there any way to get usage statistics?

ACCEPTED SOLUTION

youssefmrini
Databricks Employee

You can get this type of information by enabling verbose audit logs: https://docs.databricks.com/administration-guide/account-settings/audit-logs.html

They contain a lot of important metrics that you can leverage to build dashboards.
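For example, once the delivered JSON audit logs are sitting in cloud storage, a sketch along these lines reads them with PySpark and counts actions per user. The storage path and the email are placeholders, and query-level events only show up once verbose audit logging is enabled:

from pyspark.sql import functions as F

# Placeholder path: point this at the storage location your audit log delivery writes to.
audit = spark.read.json("abfss://logs@<storage-account>.dfs.core.windows.net/audit-logs/")

# Count what a specific user did against Databricks SQL (queries are logged
# under serviceName "databrickssql" when verbose audit logs are enabled).
(audit
    .filter(F.col("serviceName") == "databrickssql")
    .filter(F.col("userIdentity.email") == "someone@example.com")
    .groupBy("actionName")
    .count()
    .orderBy(F.desc("count"))
    .show(truncate=False))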

9 REPLIES

Anonymous
Not applicable

The Overwatch library may help. Unity Catalog also has some auditing to see who has accessed a table. For DLT, event logs are written to storage; you can read them as a DataFrame, parse out start and stop times, and count the starts to determine how often a pipeline is triggered.
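As a rough sketch, assuming the pipeline was configured with a storage location, the event log can be read from <storage>/system/events as a Delta table; the path and the field names below are assumptions to verify against your own pipeline:

from pyspark.sql import functions as F

# Placeholder path: the DLT event log lives under the pipeline's storage location.
events = spark.read.format("delta").load(
    "abfss://pipelines@<storage-account>.dfs.core.windows.net/<pipeline>/system/events"
)

# How many times the pipeline was triggered: roughly one "create_update" event per run.
events.filter(F.col("event_type") == "create_update") \
      .groupBy(F.to_date("timestamp").alias("day")) \
      .count() \
      .show()

# How long each update ran: first and last event timestamps per update id.
(events
    .groupBy(F.col("origin.update_id").alias("update_id"))
    .agg(F.min("timestamp").alias("started"), F.max("timestamp").alias("ended"))
    .show(truncate=False))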

Afeez_Raheem
New Contributor III

@Mohammad Saber, I hope you are well. I have the same request from my manager and was wondering whether you have already found a way to do this. If not, we could work on it together.

Let's connect 🙂

Mado
Valued Contributor II

@Owo Akilo

Here is what I have found so far:

If Workspace is enabled for Unity Catalog:

I set up Azure Log Analytics following the documentation. Now, I get the logs in Azure Log Analytics Workspace.

There is a table named "DatabricksUnityCatalog" where I can find the action name "getTable" together with the names of the tables that were accessed. But for reasons I don't understand, Databricks doesn't record all table usage there: I queried tables on several days, yet I only see the "getTable" action for one of those days.
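If it helps, the same table can be queried programmatically from Python with the azure-monitor-query package; this is only a sketch, and the workspace ID plus the column names (ActionName, Identity, RequestParams) are assumptions to check against what your Log Analytics workspace actually exposes:

from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

# KQL against the DatabricksUnityCatalog table; column names may need adjusting.
kql = """
DatabricksUnityCatalog
| where ActionName == "getTable"
| project TimeGenerated, Identity, RequestParams
| order by TimeGenerated desc
"""

response = client.query_workspace(
    workspace_id="<log-analytics-workspace-id>",  # placeholder
    query=kql,
    timespan=timedelta(days=7),
)

for table in response.tables:
    for row in table.rows:
        print(row)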

If Workspace is not enabled for Unity Catalog:

What I found is:

1) Using cluster logs for queries run in a notebook (Ref).

After enabling cluster log delivery, I can see table names in a text file called "log4j-active", but this file is replaced whenever I restart the cluster. It seems the file gets archived as .gz files, yet I cannot find table names in the archived files for most of the queries I ran in the notebook.

2) Query history for queries run in the SQL persona (Ref).

We can see the query history in the Databricks user interface, but I am looking for a way to send it to Power BI and extract table names from it.

I don't understand how the API works or how to use the API specification file (Ref); see the sketch below.
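From the docs, it looks like a plain HTTP GET against the query history endpoint (GET /api/2.0/sql/history/queries); this is a rough, unverified sketch, and the host, token, and result field names are placeholders/assumptions to check against the API specification:

import requests

HOST = "https://<your-workspace>.azuredatabricks.net"   # placeholder workspace URL
TOKEN = "<personal-access-token>"                        # placeholder token

resp = requests.get(
    f"{HOST}/api/2.0/sql/history/queries",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"max_results": 100},
)
resp.raise_for_status()

for q in resp.json().get("res", []):
    # Each entry carries the query text and the user who ran it; table names
    # still have to be parsed out of query_text (e.g. with a SQL parser).
    print(q.get("user_name"), q.get("query_start_time_ms"), q.get("query_text"))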

Afeez_Raheem
New Contributor III

@Mohammad Saber Many thanks for sharing what you have learned so far! I appreciate it.

If you wouldn't mind, we could connect on a Teams call to look into this in more detail. I am pretty new to the tool.

Mado
Valued Contributor II

@Owo Akilo

All right, you can see my email in my profile.

Afeez_Raheem
New Contributor III

@Mohammad Saber, thank you. I'll send you an email shortly.

Afeez_Raheem
New Contributor III

@Mohammad Saber, sorry, it appears your email isn't shown there. Could you confirm it, please?

twole
Databricks Employee

You can use system tables, now available in the Unity Catalog metastore, to build the views you described: https://docs.databricks.com/en/admin/system-tables/index.html
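A minimal sketch of the kind of view this enables, assuming system tables are enabled on the metastore; system.access.audit is the documented audit table, while the user email, table name and the request_params key below are placeholders to adapt:

# Who queried a given table, and how often, based on the getTable audit events.
counts = spark.sql("""
    SELECT user_identity.email AS user, COUNT(*) AS times_queried
    FROM system.access.audit
    WHERE action_name = 'getTable'
      AND request_params['full_name_arg'] = 'main.default.my_table'
    GROUP BY user_identity.email
""")
counts.show(truncate=False)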
