cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Retrieve table/view popularity

noorbasha534
Contributor III

Dears

Is there a way to retrieve the popularity score of an unity catalog object? I looked at APIs documentation but couldn't find one that serves the need.

Appreciate any thoughts.

Br,

Noor.

2 ACCEPTED SOLUTIONS

Accepted Solutions

Hi @noorbasha534,

You can query system.access.audit table:

%sql
SELECT request_params.full_name_arg AS table_Name, COUNT(*) AS table_usage
FROM
  system.access.audit
WHERE
  action_name = 'getTable'
GROUP BY
  table_name
ORDER BY 
  table_usage DESC

The table is in preview not enabled by default. To enable you need to send API call:

# Import necessary libraries
import requests

# Define the necessary variables
workspace_url = "<workspace_url>"
metastore_id = "metastore_id"
schema_name = "access"
pat_token = "pat_token"

# Construct the URL for the API request
url = f"https://{workspace_url}/api/2.0/unity-catalog/metastores/{metastore_id}/systemschemas/{schema_name}"

# Set the headers for the HTTP request
headers = {
    "Authorization": f"Bearer {pat_token}"
}

# Make the PUT request to enable access
response = requests.put(url, headers=headers)

# Check the response status
if response.status_code == 200:
    print(f"Access successfully enabled for schema `{schema_name}`.")
elif response.status_code == 401:
    print("Authorization failed. Please check your PAT token.")
else:
    print(f"Failed to enable access: {response.status_code} - {response.text}")

 

View solution in original post

noorbasha534
Contributor III

@filipniziol Hi Filip, Thank you. I did a quick test. In my environment, the table query (indirect) event is getting registered with "getTemporaryTableCredential". However, the view query (direct) event is with "getTable".

Thanks for your time again. Appreciate the support.

View solution in original post

7 REPLIES 7

szymon_dybczak
Esteemed Contributor III

Hi @noorbasha534 ,

You can take a look at table insights:

https://docs.databricks.com/en/discover/table-insights.html

Another option you can try is to use system tables and create based on them custom reports.

Hi @szymon_dybczak I like to retrieve programmatically what is displayed in the UI. If the system tables store the "direct" value of popularity score, please kindly guide me to the relevant table/column.

Appreciate. Thanks in advance.

Hi @noorbasha534,

You can query system.access.audit table:

%sql
SELECT request_params.full_name_arg AS table_Name, COUNT(*) AS table_usage
FROM
  system.access.audit
WHERE
  action_name = 'getTable'
GROUP BY
  table_name
ORDER BY 
  table_usage DESC

The table is in preview not enabled by default. To enable you need to send API call:

# Import necessary libraries
import requests

# Define the necessary variables
workspace_url = "<workspace_url>"
metastore_id = "metastore_id"
schema_name = "access"
pat_token = "pat_token"

# Construct the URL for the API request
url = f"https://{workspace_url}/api/2.0/unity-catalog/metastores/{metastore_id}/systemschemas/{schema_name}"

# Set the headers for the HTTP request
headers = {
    "Authorization": f"Bearer {pat_token}"
}

# Make the PUT request to enable access
response = requests.put(url, headers=headers)

# Check the response status
if response.status_code == 200:
    print(f"Access successfully enabled for schema `{schema_name}`.")
elif response.status_code == 401:
    print("Authorization failed. Please check your PAT token.")
else:
    print(f"Failed to enable access: {response.status_code} - {response.text}")

 

@filipniziol Hi Filip, ah ok. So, it is the usage from audit tables that is leveraged to derive popularity score. Thanks for this.

One final check please - we expose views to our users but I like to get popularity of tables and impose some data quality checks for the most popular tables, say, those that have been hit 100 times in a week. If an user is querying a view that has 2 underlying tables, the audit tables will record "getTable" event  for those 2 tables as UC query engine will try to parse the view query? Are you aware of this?

Thanks in advance.

filipniziol
Esteemed Contributor

Hi @noorbasha534 ,

yes, you are correct. When the query is hitting 2 tables, the audit table should record 2 "getTable" events.

Regards,
Filip

noorbasha534
Contributor III

@filipniziol Hi Filip, Thank you. I did a quick test. In my environment, the table query (indirect) event is getting registered with "getTemporaryTableCredential". However, the view query (direct) event is with "getTable".

Thanks for your time again. Appreciate the support.

szymon_dybczak
Esteemed Contributor III

Hi @noorbasha534 ,

Please mark @filipniziol answer as a solution for this thread. This will help others with same kind of problem find proper solution faster.