cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Databricks SQL Warehouse Querys went to orphan state

ChsAIkrishna
Contributor

We're experiencing an issue with our Databricks dbt workflow and workflow job is using the SQL warehouse L size cluster that's been working smoothly for the past couple of weeks. However, today we've noticed that at a specific time, all queries are getting stuck in an 'orphan' state and none are being executed. and the query's is not having any abnormality those are very lite sql's. after restarting the SQL Warehouse jobs process continues. 

is any permanent fix for this?

 

11 REPLIES 11

Walter_C
Databricks Employee
Databricks Employee

Here are some steps and considerations to help troubleshoot and resolve this issue:

  1. Check SQL Warehouse Sizing and Scaling:

    • Ensure that the SQL warehouse is appropriately sized for your workload. According to the documentation, SQL warehouses use dynamic concurrency to handle varying demands. If the number of concurrent queries or the complexity of the queries has increased, it might be necessary to adjust the cluster size or the number of clusters.
    • Refer to the SQL warehouse sizing, scaling, and queuing behavior documentation to understand how to optimize the cluster size and manage concurrent loads effectively.
  2. Monitor Query Performance:

    • Use the monitoring tools provided by Databricks to evaluate the performance of your SQL warehouse. Check the peak query count and the number of queries in the queue. If the peak queued queries are commonly above one, consider adding more clusters.
    • Review the query history and query profiles to identify any patterns or specific queries that might be causing the issue.

@Walter_C  No luck, after the sql warehouse reboot it is coming back, is any remedy apart from the query's killing ? 

ChsAIkrishna
Contributor

this is happening weekly once or twice and suddenly all the queries went to hang state. and observed all the query's are started on almost same time 1-2 mins different, max limit is 10 concurrent due to this other jobs are not moving.

ChsAIkrishna
Contributor

verified the sql monitor no peak queued queries

data volumes are same and some time even one row sql also went to hang state

verified the history and query profiles same sql is working after the reboot.

Walter_C
Databricks Employee
Databricks Employee

You have mentioned that this was working properly in the past, does this warehouses have autoscaling enabled?

"working properly in the past " its a diffrent issue that is connection failure added the profile.yml with your suggests it went well. 

this is new issue : query's are going to orphan state (hang not moving) on sql warehouse 

Walter_C
Databricks Employee
Databricks Employee

Do you have capability to open support case so we can look at your workspace?

yes we have a open ticket with MS EXT CHAT 2412190030007696

Walter_C
Databricks Employee
Databricks Employee

Thanks, MSFT might reach out to us and we will provide assistance

ChsAIkrishna
Contributor

Databricks team resolved and applied a patch (spark config on sql warehouse).

Walter_C
Databricks Employee
Databricks Employee

Great to hear your issue got resolved