cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

how to get schema and catalog name in sql warehouse query history API

ayush25091995
New Contributor III

Hi,

we are using SQL history query API by selecting catalog and schema name directly from SQL editor instead of passing it through query, we are not getting the schema name and catalog name in query text for that particular id.

So, how can we get the schema and catalog name for that types of queries because the queryid also does not match with the respective system generated query.

Thanks,

Ayush

 

1 ACCEPTED SOLUTION

Accepted Solutions

raphaelblg
Databricks Employee
Databricks Employee

Hi @ayush25091995 , check the "Query Profile" tab in the UI or check the Spark plans through the Spark UI.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

View solution in original post

6 REPLIES 6

raphaelblg
Databricks Employee
Databricks Employee

Hi @ayush25091995 , check the "Query Profile" tab in the UI or check the Spark plans through the Spark UI.

Best regards,

Raphael Balogo
Sr. Technical Solutions Engineer
Databricks

Hi @raphaelblg , we are using REST API for sql warehouse  to get the query history and query is being written in sql editor with selected catalog and schema name in drop down of editor, so for that particulars queries, we are just getting table name in response, not the schema and catalog name

mtajmouati
Contributor

Hi,

You can try : 

# Function to extract schema and catalog from query history
def extract_schema_catalog(query_history):
    for query in query_history.get("res", []):
        query_id = query.get("query_id")
        query_text = query.get("query_text")
        catalog = query.get("metadata", {}).get("details", {}).get("catalog")
        schema = query.get("metadata", {}).get("details", {}).get("schema")
        print(f"Query ID: {query_id}")
        print(f"Query Text: {query_text}")
        print(f"Catalog: {catalog}")
        print(f"Schema: {schema}")

 

 

 

Hi @mtajmouati, there is no field/key called metadata in API response for sql query history. 
https://docs.databricks.com/api/azure/workspace/queryhistory/list

mtajmouati
Contributor

True  ! try this :

import requests
import json

# Define your Databricks workspace URL and API token
databricks_instance = "https://<your-databricks-instance>"
api_token = "dapi<your-api-token>"

# Fetch SQL query history
def get_query_history():
    url = f"{databricks_instance}/api/2.0/sql/history/queries"
    headers = {
        "Authorization": f"Bearer {api_token}"
    }
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.json()

# Retrieve query history
query_history = get_query_history()

# Print retrieved query history for inspection
print(json.dumps(query_history, indent=4))

 

def extract_query_metadata(query_history):
    for query in query_history.get("res", []):
        query_id = query.get("query_id")
        query_text = query.get("query_text")
        user_name = query.get("user_name")
        # Fetch metadata from your Databricks environment or catalog
        # This is a placeholder function
        catalog, schema = fetch_catalog_and_schema(query_text, user_name)
        print(f"Query ID: {query_id}")
        print(f"Query Text: {query_text}")
        print(f"Catalog: {catalog}")
        print(f"Schema: {schema}")
        print("---")

# Placeholder function to fetch catalog and schema based on query text and user context
def fetch_catalog_and_schema(query_text, user_name):
    # Implement logic to fetch catalog and schema
    # For example, parse the query text to identify table names
    # and cross-reference with known metadata
    # Return default values if necessary
    catalog = "default_catalog"
    schema = "default_schema"
    return catalog, schema

# Extract and print metadata
extract_query_metadata(query_history)

 

@mtajmouati,  i got your point and even thought same but the issue is, lets say, i have got table name after parsing query text, and if i match with metadata to find catalog and schema, there could be possibility that same table might be present in multiple schema and catalog right?
How will i make sure the query that i run for my catalog and schema is which one? below image if you see, after selecting catalog and schema, i am just running simple query and in query history i am getting without schema and catalog for below query, so how will i make sure i ran query for this particular catalog and schema? what if multiple entries match with same table name right.

SQL_Editor:-

ayush25091995_0-1720038116765.png

Query history:-

ayush25091995_1-1720038378281.png

 

 

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group