Long story short, I'm not sure if this is an already known problem, but the Auto Stop feature, which should stop a SQL Warehouse after a configured number of minutes of inactivity, is not working properly.
We started using SQL Warehouses more aggressively this December when we scaled up one of our operations (this process uses a GoLang connector) and noticed it was getting pretty expensive.
After some investigation, we found this behavior:
Even though queries had finished running and there was no activity in the cluster, the cluster did not go down, as if it were still running.
To make sure it was not a problem with the visualization, I checked the warehouse events in the system.compute.warehouse_events table, and the warehouse was, in fact, not being turned off.
In one particular example, this is the log for the warehouse:
STARTING 0 2024-12-25T03:02:46.198+00:00
STOPPING 1 2024-12-25T04:07:09.253+00:00
And these were the only queries executed during that period (start time and end time):
2024-12-25T03:15:44.147+00:00 2024-12-25T03:15:49.266+00:00
2024-12-25T03:56:27.947+00:00 2024-12-25T03:56:28.063+00:00
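To make the gap concrete, here is a minimal sketch that computes the idle stretches from the timestamps above and compares them with the warehouse's 1-minute auto-stop setting. The function name idle_gaps and the variable names are just for illustration, and it assumes the query windows do not overlap.

```python
from datetime import datetime

AUTO_STOP_SECONDS = 60  # the 1-minute auto-stop setting described in this post

# Values copied from the warehouse log and query list above.
warehouse_start = "2024-12-25T03:02:46.198+00:00"
warehouse_stop = "2024-12-25T04:07:09.253+00:00"
queries = [
    ("2024-12-25T03:15:44.147+00:00", "2024-12-25T03:15:49.266+00:00"),
    ("2024-12-25T03:56:27.947+00:00", "2024-12-25T03:56:28.063+00:00"),
]

def idle_gaps(start, stop, query_windows):
    """Return the idle stretches, in seconds, between warehouse start and stop.

    Assumes the query windows do not overlap (illustrative helper only).
    """
    gaps = []
    cursor = datetime.fromisoformat(start)
    for q_start, q_end in sorted(query_windows):
        gaps.append((datetime.fromisoformat(q_start) - cursor).total_seconds())
        cursor = datetime.fromisoformat(q_end)
    gaps.append((datetime.fromisoformat(stop) - cursor).total_seconds())
    return gaps

gaps = idle_gaps(warehouse_start, warehouse_stop, queries)
for gap in gaps:
    print(f"idle for {gap:.3f}s, over auto-stop threshold: {gap > AUTO_STOP_SECONDS}")
```

With these timestamps, all three idle stretches come out at more than ten minutes each (roughly 13, 41, and 11 minutes), while the two queries together ran for about 5 seconds.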
This warehouse is configured to auto-stop after 1 minute of inactivity, which did NOT happen. To test the cost impact of this, I wrote a quick piece of code using Serverless and AWS Lambda that forces the warehouse to stop when there are no queries running or queued.
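For anyone who wants the idea without reading the repo, the check boils down to something like the sketch below. This is not the code from the linked repository, just a rough illustration under assumptions: the helper names are made up, the stop call assumes the SQL Warehouses REST endpoint POST /api/2.0/sql/warehouses/{id}/stop, and how you fetch the warehouse state and count running/queued queries is left to callables you supply. Please verify the endpoint against the Databricks API docs before relying on it.

```python
import urllib.request
from typing import Callable

def should_stop(state: str, active_queries: int) -> bool:
    """Stop only a running warehouse with nothing running or queued."""
    return state == "RUNNING" and active_queries == 0

def build_stop_request(host: str, token: str, warehouse_id: str) -> urllib.request.Request:
    """Build (but do not send) the stop request.

    Assumed endpoint: POST /api/2.0/sql/warehouses/{id}/stop
    """
    url = f"https://{host}/api/2.0/sql/warehouses/{warehouse_id}/stop"
    return urllib.request.Request(
        url,
        data=b"{}",
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

def check_and_stop(
    get_state: Callable[[], str],
    count_active_queries: Callable[[], int],
    send_stop: Callable[[], None],
) -> bool:
    """Run one check; return True if a stop was issued.

    The three callables are hypothetical hooks: plug in your own code that
    reads the warehouse state, counts running/queued queries, and sends
    the stop request (for example with urllib.request.urlopen).
    """
    if should_stop(get_state(), count_active_queries()):
        send_stop()
        return True
    return False
```

In a Lambda, a scheduled handler would call check_and_stop once per invocation, passing a send_stop that does urllib.request.urlopen(build_stop_request(host, token, warehouse_id)).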
These are the results in terms of warehouse costs:
Basically a drop from $40/day to $10/day or less.
I'm not sure what is causing the warehouse to stay "on", but I'm guessing it is interpreting JDBC connections as "active sessions" even though they're not running any queries - anyway, this is a serious problem in my opinion.
For those of you that want to copy the approach, the code can be found here: https://github.com/lmachado-sousa/databricks-sql-warehouse-terminator