<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Databricks SQL Warehouse Hung - Queries Stuck in Queued State &amp; No Alerts Triggered in Administration &amp; Architecture</title>
    <link>https://community.databricks.com/t5/administration-architecture/databricks-sql-warehouse-hung-queries-stuck-in-queued-state-amp/m-p/108482#M2919</link>
    <pubDate>Sun, 02 Feb 2025 23:37:20 GMT</pubDate>
    <dc:creator>Isi</dc:creator>
    <dc:date>2025-02-02T23:37:20Z</dc:date>
    <item>
      <title>Databricks SQL Warehouse Hung - Queries Stuck in Queued State &amp; No Alerts Triggered</title>
      <link>https://community.databricks.com/t5/administration-architecture/databricks-sql-warehouse-hung-queries-stuck-in-queued-state-amp/m-p/108451#M2917</link>
      <description>&lt;P&gt;We have been facing critical challenges with &lt;STRONG&gt;Databricks SQL Warehouse&lt;/STRONG&gt; for the last four weeks. We are using &lt;STRONG&gt;Databricks SQL Warehouse ingestion from IICS&lt;/STRONG&gt;, and we have observed the following issues:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;STRONG&gt;SQL Warehouse Going into a Hung State&lt;/STRONG&gt; – The SQL Warehouse becomes completely unresponsive.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;All Queries Stuck in Queued State&lt;/STRONG&gt; – None of the queries are processing, leading to severe workflow disruptions.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;No Alerts Triggered&lt;/STRONG&gt; – Since the SQL Warehouse is hung, &lt;STRONG&gt;we do not receive any alerts&lt;/STRONG&gt;, making it impossible to proactively respond.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;No Logs or Health Metrics Available&lt;/STRONG&gt; – We do not have visibility into logs or any other SQL Warehouse health monitoring to diagnose the issue.&lt;/LI&gt;&lt;/OL&gt;&lt;H3&gt;&lt;STRONG&gt;Questions &amp;amp; Help Needed:&lt;/STRONG&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;How can we monitor SQL Warehouse health in real-time?&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Are there any recommended best practices for debugging SQL Warehouse when it hangs?&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Is there a way to enable logging or diagnostics when the warehouse becomes unresponsive?&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Are there any settings in Databricks that can help us auto-recover from such failures?&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;This issue is severely impacting our workloads, and any guidance or solutions would be greatly appreciated.&lt;/P&gt;</description>
      <pubDate>Sun, 02 Feb 2025 15:33:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/databricks-sql-warehouse-hung-queries-stuck-in-queued-state-amp/m-p/108451#M2917</guid>
      <dc:creator>sdheepak</dc:creator>
      <dc:date>2025-02-02T15:33:45Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks SQL Warehouse Hung - Queries Stuck in Queued State &amp; No Alerts Triggered</title>
      <link>https://community.databricks.com/t5/administration-architecture/databricks-sql-warehouse-hung-queries-stuck-in-queued-state-amp/m-p/108482#M2919</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/116595"&gt;@sdheepak&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;The first thing to identify is &lt;STRONG&gt;the type of SQL Warehouse&lt;/STRONG&gt; you are using in Databricks:&lt;/P&gt;
&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Is it Serverless?&lt;/STRONG&gt; If so, the compute is fully managed by Databricks, so &lt;STRONG&gt;you won’t have access to logs&lt;/STRONG&gt; in your cloud provider and &lt;STRONG&gt;must contact Databricks support&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Is it Classic or Pro?&lt;/STRONG&gt; In this case the compute runs in your own cloud account, so you &lt;STRONG&gt;may be able to check logs&lt;/STRONG&gt; on the EC2 instances (AWS) or virtual machines (Azure/GCP).&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;How can we monitor SQL Warehouse health in real-time?&lt;/STRONG&gt; Go to &lt;STRONG&gt;Compute &amp;gt; SQL Warehouses&lt;/STRONG&gt;, where you can check:&lt;/P&gt;
&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Warehouse type&lt;/STRONG&gt; (Serverless, Classic, Pro)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Size and active status&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Autoscale settings&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Running queries, queued queries, query peaks, and completed queries&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;For historical queries:&lt;/STRONG&gt; go to &lt;STRONG&gt;SQL &amp;gt; Query History&lt;/STRONG&gt; and filter by &lt;STRONG&gt;warehouse and date&lt;/STRONG&gt; (up to &lt;STRONG&gt;14 days max&lt;/STRONG&gt; of history).&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Are there any best practices for debugging SQL Warehouse when it hangs?&lt;/STRONG&gt; &lt;STRONG&gt;If you are using Serverless&lt;/STRONG&gt;, I strongly recommend switching to &lt;STRONG&gt;Classic&lt;/STRONG&gt;. It is &lt;STRONG&gt;cheaper&lt;/STRONG&gt;, allows &lt;STRONG&gt;better fine-tuning of infrastructure&lt;/STRONG&gt;, and &lt;STRONG&gt;doesn’t autoscale as aggressively&lt;/STRONG&gt;, which means fewer &lt;STRONG&gt;Databricks Units (DBUs)&lt;/STRONG&gt; consumed and lower costs.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Is there a way to enable logging or diagnostics when the warehouse becomes unresponsive?&lt;/STRONG&gt; It depends on what kind of &lt;STRONG&gt;“hung state”&lt;/STRONG&gt; you are experiencing:&lt;/P&gt;
&lt;UL&gt;&lt;LI&gt;If &lt;STRONG&gt;queries appear to be “running” indefinitely&lt;/STRONG&gt;, check &lt;STRONG&gt;Query History&lt;/STRONG&gt; for &lt;STRONG&gt;errors related to queries, connections, or processing failures&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;LI&gt;If &lt;STRONG&gt;there are no visible logs or query failures&lt;/STRONG&gt;, &lt;STRONG&gt;you may need Databricks support&lt;/STRONG&gt; to investigate further.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Are there any settings in Databricks that can help us auto-recover from such failures?&lt;/STRONG&gt; If you are &lt;STRONG&gt;running these queries from an external orchestrator&lt;/STRONG&gt;, &lt;STRONG&gt;Databricks does not provide built-in auto-recovery&lt;/STRONG&gt; for SQL Warehouses. The best solution is to &lt;STRONG&gt;implement a retry mechanism&lt;/STRONG&gt; in your orchestrator/system, so that queries are retried automatically if no response is received.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Alternative approach:&lt;/STRONG&gt; instead of running queries externally, you can create a &lt;STRONG&gt;Databricks Workflow&lt;/STRONG&gt; with multiple tasks and &lt;STRONG&gt;configure retry policies&lt;/STRONG&gt; to reduce failures.&lt;/P&gt;
&lt;P&gt;Hope that helps &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
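As a rough illustration of the retry mechanism suggested in the reply above, here is a minimal Python sketch. It assumes the external orchestrator can wrap each query submission in a callable. The names run_with_retry, submit, and QueryTimeout are hypothetical and are not part of any Databricks or IICS API. The sketch retries with exponential backoff whenever the submission signals that no response arrived.

```python
import time


class QueryTimeout(Exception):
    """Hypothetical error: the caller's submit function raises it when no response arrives in time."""


def run_with_retry(submit, max_attempts=4, base_delay_s=2.0, max_delay_s=60.0, sleep=time.sleep):
    """Call submit() until it returns, retrying on QueryTimeout with exponential backoff.

    submit       -- zero-argument callable that runs one query and returns its result
    max_attempts -- total number of tries, including the first one
    base_delay_s -- wait before the second try; doubles after each failure
    max_delay_s  -- upper bound on any single wait
    sleep        -- injectable so that tests do not actually wait
    """
    if 1 > max_attempts:
        raise ValueError("max_attempts must be at least 1")

    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except QueryTimeout as err:
            last_error = err
            if attempt == max_attempts:
                break  # out of attempts, fall through to the final error
            # Waits grow as 2s, 4s, 8s, ... capped at max_delay_s.
            sleep(min(max_delay_s, base_delay_s * 2 ** (attempt - 1)))

    raise RuntimeError(f"query failed after {max_attempts} attempts") from last_error
```

Retrying blindly is only safe for idempotent statements: a write that timed out on the client side may still have been applied by the warehouse, so the orchestrator may need to check the outcome before resubmitting.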
      <pubDate>Sun, 02 Feb 2025 23:37:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/databricks-sql-warehouse-hung-queries-stuck-in-queued-state-amp/m-p/108482#M2919</guid>
      <dc:creator>Isi</dc:creator>
      <dc:date>2025-02-02T23:37:20Z</dc:date>
    </item>
  </channel>
</rss>

