02-03-2022 01:38 PM
Currently, I am running a cluster that is set to terminate after 60 minutes of inactivity. However, in one of my notebooks, a cell is still running when the cluster terminates. Why is this happening, and how can I prevent it if I want my notebook to run overnight without monitoring it?
Accepted Solutions
02-03-2022 02:07 PM
@Kevin Kim As mentioned in the docs at https://docs.databricks.com/clusters/clusters-manage.html#automatic-termination-1, an auto-terminating cluster may be terminated while it is still running commands. From the docs: "The auto termination feature monitors only Spark jobs, not user-defined local processes. Therefore, if all Spark jobs have completed, a cluster may be terminated even if local processes are running."
If you want to keep the cluster active all the time, you can either disable "Automatic termination" (if allowed) or create a notebook with a trivial command such as a simple print or "%sql select 1" and schedule it to run at regular intervals (avoid scheduling it forever) to keep the cluster active.
Also, have you explored scheduling the notebook as a job? Auto termination applies only to all-purpose clusters, not job clusters.
***Note: Idle clusters continue to accumulate DBU and cloud instance charges during the inactivity period before termination.***
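The scheduled keep-alive notebook suggested above could be set up through the Jobs API. A minimal sketch, assuming Jobs API 2.1; the function name `build_keep_alive_job` and the path `/Shared/keep_alive` are illustrative placeholders, not real workspace objects:

```python
def build_keep_alive_job(cluster_id,
                         notebook_path="/Shared/keep_alive",
                         cron="0 0/30 * * * ?"):
    """Build a payload for POST /api/2.1/jobs/create that runs a trivial
    notebook on an existing all-purpose cluster every 30 minutes
    (Quartz cron: sec min hour day month day-of-week), resetting the
    cluster's inactivity timer."""
    return {
        "name": "cluster-keep-alive",
        "schedule": {
            "quartz_cron_expression": cron,
            "timezone_id": "UTC",
        },
        "tasks": [{
            "task_key": "keep_alive",
            "existing_cluster_id": cluster_id,  # reuse the cluster you want kept warm
            "notebook_task": {"notebook_path": notebook_path},
        }],
    }

# The keep-alive notebook itself only needs one trivial Spark job, e.g.:
#   spark.sql("select 1").collect()
```

Note that scheduling against an `existing_cluster_id` (rather than a new job cluster) is what keeps the all-purpose cluster active.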
10-01-2024 12:00 AM - edited 10-01-2024 12:02 AM
Hello,
I've been using this solution to keep my session alive while a process needs it: the cluster auto-terminates after 10 minutes of inactivity, and some of our processes use a "keep alive" feature (others don't). We start a thread that periodically runs a basic spark.sql("select 1").collect().
This worked well on the 10.4 LTS runtime. We then had to move our Databricks workspace from France Central (Azure) to West Europe and upgrade to the 12.2 LTS runtime, and now the query is no longer counted as activity.
That is, the query is still sent periodically, but it no longer keeps the cluster alive: the cluster shuts down, and when the next tick runs the "select 1" it restarts the cluster, but we lose the Spark session and all our DataFrames.
Any idea how to fix this?
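For reference, the keep-alive thread described above could be sketched as below. This is a sketch under stated assumptions: `start_keep_alive` and `run_trivial_job` are illustrative names, where `run_trivial_job` would be something like `lambda: spark.sql("select 1").collect()` inside Databricks; whether such a trivial job still counts as activity on 12.2 LTS is exactly the open question here.

```python
import threading


def start_keep_alive(run_trivial_job, interval_seconds=300, stop_event=None):
    """Periodically submit a trivial Spark job from a background thread so
    the auto-termination monitor registers activity.

    run_trivial_job: callable that submits a real Spark job, e.g.
        lambda: spark.sql("select 1").collect()   # inside a Databricks notebook
    Returns the stop event; call .set() on it to stop the loop.
    """
    stop_event = stop_event or threading.Event()

    def loop():
        while not stop_event.is_set():
            run_trivial_job()
            # wait() instead of sleep() so the loop can be stopped promptly
            stop_event.wait(interval_seconds)

    threading.Thread(target=loop, daemon=True).start()
    return stop_event
```

The daemon flag means the thread will not block notebook detach; the returned event gives the owning process a clean way to stop the keep-alive when its work is done.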
02-03-2022 04:18 PM
Can you just turn off the auto-termination?
02-03-2022 11:41 PM
Indeed, uncheck the 'Terminate after x minutes of inactivity' flag in the cluster configuration.
08-07-2024 12:31 PM
How do you uncheck this? Where can I find this button?
02-10-2022 06:13 AM
If a cell is already running (I assume it's a streaming operation), the cluster is not considered inactive; a cluster should stay up while a cell is actively running a Spark job on it.
On the other hand, if you want to keep your cluster running for a specific period of time (say 10pm to 8am), you can schedule a job, using a cron expression, to run a basic command in a simple notebook on that very cluster at regular intervals (e.g. at the 55th minute of every hour).
Also, you can use the Clusters API to monitor whether the cluster is running.
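The last suggestion, checking cluster state via the Clusters API, might look like the sketch below. It assumes the GET /api/2.0/clusters/get endpoint, whose response includes a `state` field; `is_cluster_running`, `host`, and `token` are illustrative names, and the set of "usable" states is an assumption you should check against the Clusters API docs for your workspace.

```python
# States in which the cluster is up and usable (assumption; verify
# against the Clusters API documentation for your Databricks version).
RUNNING_STATES = {"RUNNING", "RESIZING"}


def is_cluster_running(cluster_info):
    """Interpret the JSON body returned by GET /api/2.0/clusters/get,
    which includes a 'state' field (e.g. PENDING, RUNNING, TERMINATED)."""
    return cluster_info.get("state") in RUNNING_STATES


# Fetching the state might look like this (host/token/cluster_id are
# placeholders for your workspace URL, PAT, and cluster ID):
#
# import requests
# resp = requests.get(
#     f"https://{host}/api/2.0/clusters/get",
#     headers={"Authorization": f"Bearer {token}"},
#     params={"cluster_id": cluster_id},
# )
# print(is_cluster_running(resp.json()))
```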

