Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

orphan queries in running state

meanwhilefurthe
New Contributor III

We have a job that is submitted through the Spark Connect API and runs on serverless compute.

The job was canceled twice, which left a total of 14 orphaned queries. They are in a weird state: they still show up as running, but their running time is not increasing.

There is no UI for serverless compute, and the Spark UI is not available either, since the compute is managed by Databricks. The API for cancelling queries returns an empty response, which is apparently the expected behavior, but the queries remain in a running state.

Is there any way to cancel these queries? There is no cancel button in the UI either.

10 REPLIES

Khaja_Zaffer
Contributor

Hello @meanwhilefurthe,

Can you run CANCEL QUERY 'your_query_id', replacing your_query_id with the ID of the query you want to cancel?

Hey @Khaja_Zaffer, I appreciate your reply, but CANCEL QUERY does not work. This is serverless compute, so a newly created session can no longer communicate with the old one.
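Since a new session cannot reach operations started by an old one, any cancellation has to be issued from the session that started them. A minimal sketch of that idea follows, assuming a Spark Connect session from PySpark 3.5 or later, where SparkSession exposes addTag, interruptTag, and removeTag. The helper name tagged_operations and the tag value are hypothetical, and whether serverless compute honors the interrupt in this situation is not confirmed anywhere in this thread. It would not clean up queries that are already orphaned; it only aims to reduce the chance of leaving new ones behind.

```python
from contextlib import contextmanager


@contextmanager
def tagged_operations(spark, tag):
    """Tag operations started inside the block and interrupt them on abnormal exit.

    `spark` is assumed to be a Spark Connect SparkSession (PySpark 3.5+).
    `tagged_operations` is a hypothetical helper, not part of any library.
    """
    # Every operation started by this session from now on carries the tag.
    spark.addTag(tag)
    try:
        yield
    except BaseException:
        # Covers ordinary errors as well as KeyboardInterrupt raised when
        # the client process is being canceled. The interrupt is sent from
        # the same session that started the operations.
        spark.interruptTag(tag)
        raise
    finally:
        # Stop tagging subsequent operations on this session.
        spark.removeTag(tag)


# Hypothetical usage, with `spark` being an existing Spark Connect session:
#
# with tagged_operations(spark, "nightly-load"):
#     spark.sql("SELECT ...").collect()
```

The interrupt is attempted only when the block exits with an exception, so a normal run behaves as before. If the client process is killed outright, no cleanup code runs at all, which may be how queries end up orphaned in the first place.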

Khaja_Zaffer
Contributor

You can configure a timeout for your Spark queries by setting the spark.databricks.queryWatchdog.timeoutInSeconds configuration property. This will automatically terminate any query that exceeds the specified execution time, preventing them from becoming long-running orphans.

We do have timeouts, and there are default timeouts as well. The issue is not that the query runs longer than the timeout, but that it is stuck in this weird state where it shows as running while its running time and other metrics are no longer being updated.

Khaja_Zaffer
Contributor

I think the internals need to be checked for this issue, so it is better to create a ticket with Databricks.

Please raise the ticket using this link: https://help.databricks.com/s/contact-us?ReqType=training Please explain the issue clearly so that it is easy for the support team to help.

I already did, and support redirected me here. The ticket I opened is 00699724.

Khaja_Zaffer
Contributor

Just asking: are you using Azure?

Nope, AWS.

Khaja_Zaffer
Contributor

Also, did you make any recent code changes or network changes?

The only change was increasing "spark.databricks.execution.timeout", because the query unfortunately needed more than 2.5 hours.