10-07-2024 07:13 PM
How do I use the REST API to find long-running queries in Databricks from sql/queries/all?
10-07-2024 09:10 PM
Hi @slakshmanan
To find long-running queries in Databricks using the REST API, use the Query History API (sql/queries/all is the UI path; the REST endpoint is different). You'll need to follow these steps:
Prerequisites:
Databricks Workspace: the REST API is available by default; no separate enablement is needed.
Access Token: a valid personal access token for authentication. You can generate one in the Databricks UI under the User Settings page.
Workspace URL: Your Databricks workspace URL (e.g., https://<databricks-instance>.cloud.databricks.com).
Steps:
1. API Authentication
You authenticate by passing a bearer token in the headers of the API request:
Authorization: Bearer <Access Token>
2. REST API Endpoint
The API endpoint for listing all SQL queries is:
GET /api/2.0/sql/history/queries
3. Query Parameters
The endpoint supports these request parameters:
max_results: Number of queries to return per page (default 100).
page_token: Opaque token for fetching the next page of results.
filter_by: Filters on status, user, warehouse, or query start-time range.
include_metrics: When true, includes per-query execution metrics.
Results are returned from the most recent query to the oldest; a minimal request sketch follows this list.
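For example, here is a minimal sketch that lists only currently running queries, assuming placeholder workspace values and that the filter can be sent as a JSON request body (Databricks curl examples show this pattern; dotted query parameters such as filter_by.statuses are an alternative):
python
import requests

# Placeholder values -- replace with your workspace URL and token
url = "https://<databricks-instance>.cloud.databricks.com/api/2.0/sql/history/queries"
headers = {"Authorization": "Bearer <your-access-token>"}

# Ask only for queries that are still running, with per-query metrics included;
# the filter is sent as a JSON body, which this API accepts on GET
payload = {
    "filter_by": {"statuses": ["RUNNING"]},
    "include_metrics": True,
    "max_results": 100,
}

response = requests.get(url, headers=headers, json=payload)
response.raise_for_status()
for q in response.json().get("res", []):
    print(q["query_id"], q.get("status"), q.get("duration"))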
4. Filter for Long-Running Queries
You'll need to programmatically analyze the results to identify long-running queries. Each entry in the response includes fields such as:
query_id: Unique identifier for the query.
query_start_time_ms: Query start time in epoch milliseconds.
query_end_time_ms: Query end time in epoch milliseconds.
duration: Total execution time of the query in milliseconds.
query_text: The SQL query text.
status: Query state (QUEUED, RUNNING, FINISHED, FAILED, or CANCELED).
You can use the duration field to flag queries that ran for too long.
Example Response Snippet:
json
{
  "res": [
    {
      "query_id": "abc123",
      "query_text": "SELECT * FROM example_table",
      "query_start_time_ms": 1696456800000,
      "query_end_time_ms": 1696460400000,
      "duration": 3600000,
      "status": "FINISHED"
    },
    ...
  ]
}
You can compare each query's duration against a threshold (e.g., 10 minutes = 600,000 ms) and flag the queries that exceed it.
5. Code Example (Python + requests):
python
import requests

# Replace with your Databricks instance and access token
databricks_instance = "<databricks-instance>"
access_token = "<your-access-token>"

headers = {
    "Authorization": f"Bearer {access_token}"
}

# API URL to fetch SQL query history
url = f"https://{databricks_instance}.cloud.databricks.com/api/2.0/sql/history/queries"

# Fetch the most recent page of history (filters and pagination can be added)
response = requests.get(url, headers=headers, params={"max_results": 100})

if response.status_code == 200:
    queries = response.json().get("res", [])
    # Filter for long-running queries (e.g., > 10 minutes)
    long_running_queries = [q for q in queries if q.get("duration", 0) > 600000]
    for query in long_running_queries:
        print(f"Query ID: {query['query_id']}, Duration: {query['duration']} ms")
else:
    print(f"Error: {response.status_code} - {response.text}")
This script fetches the query history and flags the queries that ran for more than 10 minutes (600,000 milliseconds).
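Note that the endpoint is paginated, so a single call returns only the first page. Here is a minimal sketch of walking every page with the next_page_token cursor, reusing the url and headers from the script above:
python
# Walk every page of query history using the next_page_token cursor
all_queries = []
params = {"max_results": 100}
while True:
    resp = requests.get(url, headers=headers, params=params)
    resp.raise_for_status()
    body = resp.json()
    all_queries.extend(body.get("res", []))
    if not body.get("has_next_page"):
        break
    # Subsequent pages are requested with the token from the previous response
    params = {"page_token": body["next_page_token"], "max_results": 100}

print(f"Fetched {len(all_queries)} queries in total")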
10-08-2024 10:13 AM
I would like to see if there are any queries running for more than 30 minutes, or queries pulling more than 1 million rows.
10-08-2024 07:38 AM
Thanks, I will try this and let you know.
10-08-2024 07:45 AM
@Ajay-Pandey To find out whether my Databricks workspace is being accessed via the REST API, how do I check it?
10-08-2024 10:39 AM
To check for API call logs in Databricks, look at the audit logs: if your workspace has Unity Catalog system tables enabled, API activity is recorded in the system.access.audit table; otherwise you can configure audit log delivery from the account console.
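A minimal sketch of querying that table from a notebook, assuming Unity Catalog system tables are enabled in your workspace (column names follow the documented audit table schema):
python
# List recent audit events, which include REST API calls
recent_api_calls = spark.sql("""
    SELECT event_time, user_identity.email, service_name, action_name
    FROM system.access.audit
    WHERE event_date >= current_date() - INTERVAL 7 DAYS
    ORDER BY event_time DESC
    LIMIT 100
""")
recent_api_calls.show(truncate=False)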
10-08-2024 02:40 PM
How do I cancel or kill a long-running SQL query from a Databricks Python notebook? I have a long-running SQL query in a SQL warehouse.
10-08-2024 11:43 PM
Note that SHOW PROCESSLIST and KILL QUERY are MySQL commands; Spark SQL on Databricks does not support them, so that approach will not run here. For a query executing on a SQL warehouse, you can cancel it from the Query History page in the UI (open the running query and click Cancel). If the statement was submitted through the Statement Execution API, you can also cancel it programmatically, as shown below.
3 weeks ago
@Ajay-Pandey Hi, I want to check which queries are actively running and taking a long time to execute. Please suggest how to check. Thanks.
2 weeks ago
@Ajay-Pandey Thanks. This API works fine, but it is showing only the current day's queries. How can we get all queries that are currently running?