Tuesday
Hi,
we are using SQL history query API by selecting catalog and schema name directly from SQL editor instead of passing it through query, we are not getting the schema name and catalog name in query text for that particular id.
So, how can we get the schema and catalog name for that types of queries because the queryid also does not match with the respective system generated query.
Thanks,
Ayush
Tuesday
Hi @ayush25091995 , check the "Query Profile" tab in the UI or check the Spark plans through the Spark UI.
Tuesday
Hi @raphaelblg , we are using REST API for sql warehouse to get the query history and query is being written in sql editor with selected catalog and schema name in drop down of editor, so for that particulars queries, we are just getting table name in response, not the schema and catalog name
Tuesday
Hi,
You can try :
# Function to extract schema and catalog from query history
def extract_schema_catalog(query_history):
for query in query_history.get("res", []):
query_id = query.get("query_id")
query_text = query.get("query_text")
catalog = query.get("metadata", {}).get("details", {}).get("catalog")
schema = query.get("metadata", {}).get("details", {}).get("schema")
print(f"Query ID: {query_id}")
print(f"Query Text: {query_text}")
print(f"Catalog: {catalog}")
print(f"Schema: {schema}")
Wednesday - last edited Wednesday
Hi @mtajmouati, there is no field/key called metadata in API response for sql query history.
https://docs.databricks.com/api/azure/workspace/queryhistory/list
Wednesday
True ! try this :
import requests
import json
# Define your Databricks workspace URL and API token
databricks_instance = "https://<your-databricks-instance>"
api_token = "dapi<your-api-token>"
# Fetch SQL query history
def get_query_history():
url = f"{databricks_instance}/api/2.0/sql/history/queries"
headers = {
"Authorization": f"Bearer {api_token}"
}
response = requests.get(url, headers=headers)
response.raise_for_status()
return response.json()
# Retrieve query history
query_history = get_query_history()
# Print retrieved query history for inspection
print(json.dumps(query_history, indent=4))
def extract_query_metadata(query_history):
for query in query_history.get("res", []):
query_id = query.get("query_id")
query_text = query.get("query_text")
user_name = query.get("user_name")
# Fetch metadata from your Databricks environment or catalog
# This is a placeholder function
catalog, schema = fetch_catalog_and_schema(query_text, user_name)
print(f"Query ID: {query_id}")
print(f"Query Text: {query_text}")
print(f"Catalog: {catalog}")
print(f"Schema: {schema}")
print("---")
# Placeholder function to fetch catalog and schema based on query text and user context
def fetch_catalog_and_schema(query_text, user_name):
# Implement logic to fetch catalog and schema
# For example, parse the query text to identify table names
# and cross-reference with known metadata
# Return default values if necessary
catalog = "default_catalog"
schema = "default_schema"
return catalog, schema
# Extract and print metadata
extract_query_metadata(query_history)
Wednesday
@mtajmouati, i got your point and even thought same but the issue is, lets say, i have got table name after parsing query text, and if i match with metadata to find catalog and schema, there could be possibility that same table might be present in multiple schema and catalog right?
How will i make sure the query that i run for my catalog and schema is which one? below image if you see, after selecting catalog and schema, i am just running simple query and in query history i am getting without schema and catalog for below query, so how will i make sure i ran query for this particular catalog and schema? what if multiple entries match with same table name right.
SQL_Editor:-
Query history:-
Excited to expand your horizons with us? Click here to Register and begin your journey to success!
Already a member? Login and join your local regional user group! If there isn’t one near you, fill out this form and we’ll create one for you to join!