cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Retrieve table/view popularity

noorbasha534
Contributor

Dears

Is there a way to retrieve the popularity score of an unity catalog object? I looked at APIs documentation but couldn't find one that serves the need.

Appreciate any thoughts.

Br,

Noor.

2 ACCEPTED SOLUTIONS

Accepted Solutions

Hi @noorbasha534,

You can query system.access.audit table:

%sql
SELECT request_params.full_name_arg AS table_Name, COUNT(*) AS table_usage
FROM
  system.access.audit
WHERE
  action_name = 'getTable'
GROUP BY
  table_name
ORDER BY 
  table_usage DESC

The table is in preview not enabled by default. To enable you need to send API call:

# Import necessary libraries
import requests

# Define the necessary variables
workspace_url = "<workspace_url>"
metastore_id = "metastore_id"
schema_name = "access"
pat_token = "pat_token"

# Construct the URL for the API request
url = f"https://{workspace_url}/api/2.0/unity-catalog/metastores/{metastore_id}/systemschemas/{schema_name}"

# Set the headers for the HTTP request
headers = {
    "Authorization": f"Bearer {pat_token}"
}

# Make the PUT request to enable access
response = requests.put(url, headers=headers)

# Check the response status
if response.status_code == 200:
    print(f"Access successfully enabled for schema `{schema_name}`.")
elif response.status_code == 401:
    print("Authorization failed. Please check your PAT token.")
else:
    print(f"Failed to enable access: {response.status_code} - {response.text}")

 

View solution in original post

noorbasha534
Contributor

@filipniziol Hi Filip, Thank you. I did a quick test. In my environment, the table query (indirect) event is getting registered with "getTemporaryTableCredential". However, the view query (direct) event is with "getTable".

Thanks for your time again. Appreciate the support.

View solution in original post

7 REPLIES 7

szymon_dybczak
Esteemed Contributor III

Hi @noorbasha534 ,

You can take a look at table insights:

https://docs.databricks.com/en/discover/table-insights.html

Another option you can try is to use system tables and create based on them custom reports.

Hi @szymon_dybczak I like to retrieve programmatically what is displayed in the UI. If the system tables store the "direct" value of popularity score, please kindly guide me to the relevant table/column.

Appreciate. Thanks in advance.

Hi @noorbasha534,

You can query system.access.audit table:

%sql
SELECT request_params.full_name_arg AS table_Name, COUNT(*) AS table_usage
FROM
  system.access.audit
WHERE
  action_name = 'getTable'
GROUP BY
  table_name
ORDER BY 
  table_usage DESC

The table is in preview not enabled by default. To enable you need to send API call:

# Import necessary libraries
import requests

# Define the necessary variables
workspace_url = "<workspace_url>"
metastore_id = "metastore_id"
schema_name = "access"
pat_token = "pat_token"

# Construct the URL for the API request
url = f"https://{workspace_url}/api/2.0/unity-catalog/metastores/{metastore_id}/systemschemas/{schema_name}"

# Set the headers for the HTTP request
headers = {
    "Authorization": f"Bearer {pat_token}"
}

# Make the PUT request to enable access
response = requests.put(url, headers=headers)

# Check the response status
if response.status_code == 200:
    print(f"Access successfully enabled for schema `{schema_name}`.")
elif response.status_code == 401:
    print("Authorization failed. Please check your PAT token.")
else:
    print(f"Failed to enable access: {response.status_code} - {response.text}")

 

@filipniziol Hi Filip, ah ok. So, it is the usage from audit tables that is leveraged to derive popularity score. Thanks for this.

One final check please - we expose views to our users but I like to get popularity of tables and impose some data quality checks for the most popular tables, say, those that have been hit 100 times in a week. If an user is querying a view that has 2 underlying tables, the audit tables will record "getTable" event  for those 2 tables as UC query engine will try to parse the view query? Are you aware of this?

Thanks in advance.

filipniziol
Contributor III

Hi @noorbasha534 ,

yes, you are correct. When the query is hitting 2 tables, the audit table should record 2 "getTable" events.

Regards,
Filip

noorbasha534
Contributor

@filipniziol Hi Filip, Thank you. I did a quick test. In my environment, the table query (indirect) event is getting registered with "getTemporaryTableCredential". However, the view query (direct) event is with "getTable".

Thanks for your time again. Appreciate the support.

szymon_dybczak
Esteemed Contributor III

Hi @noorbasha534 ,

Please mark @filipniziol answer as a solution for this thread. This will help others with same kind of problem find proper solution faster.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group