
Cluster termination issue

delta_bravo
New Contributor

I am using Databricks as a Community Edition user with a limited cluster (just 1 Driver: 15.3 GB Memory, 2 Cores, 1 DBU). I am trying to run some custom algorithms for continuous calculations, writing results to a Delta table every 15 minutes and notifying myself by email over SMTP.

The problem is that I intend to run calculations for, let's say, a particular depth (imagine building a hierarchical-deterministic wallet that is represented as a tree), and those calculations may take a few hours or even up to a day. But for some reason, my cluster is terminated after 1 hour of processing.

I looked for solutions to similar issues, but suggestions like running spark.sql("select 1") just to keep the cluster alive never worked for me, even when I ran it in a daemon process. As I mentioned above, I also write results to my table with df.write(...), but that doesn't keep the cluster alive long enough either.

So I am wondering whether there is any solution to my problem or another way to keep the cluster alive, or whether Community Edition users are simply limited to 1 hour of processing on a cluster.

Thanks in advance.

2 REPLIES

chardv
New Contributor II

Hi @Retired_mod, our team is wondering what happens if we put 0 minutes in the "Terminate after" setting of an all-purpose compute. Thanks!

NandiniN
Databricks Employee

If you set the "Terminate after" setting to 0 minutes when creating an all-purpose compute, auto-termination is turned off. The "Terminate after" setting specifies an inactivity period in minutes after which you want the compute to terminate: if the difference between the current time and the last command run on the compute is more than that period, Databricks automatically terminates the compute. By setting it to 0, you are opting out of auto-termination, so the compute keeps running until it is manually terminated, regardless of whether it is active or idle. Please note that idle compute continues to accumulate DBU and cloud instance charges during the inactivity period before termination.
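To make the rule above concrete, here is a minimal Python sketch of the inactivity check as described. It only illustrates the logic and is not Databricks' actual implementation; the function name `should_auto_terminate` and its parameters are made up for this example.

```python
from datetime import datetime, timedelta


def should_auto_terminate(last_command_time, now, terminate_after_minutes):
    """Illustrative only: mirrors the rule described above, not Databricks internals."""
    # A value of 0 means auto-termination is turned off.
    if terminate_after_minutes == 0:
        return False
    # Otherwise, terminate once the idle time exceeds the configured period.
    idle_time = now - last_command_time
    return idle_time > timedelta(minutes=terminate_after_minutes)


now = datetime(2024, 1, 1, 12, 0)
last_command = datetime(2024, 1, 1, 10, 30)  # compute has been idle for 90 minutes

print(should_auto_terminate(last_command, now, 60))   # idle longer than 60 minutes
print(should_auto_terminate(last_command, now, 120))  # still within 120 minutes
print(should_auto_terminate(last_command, now, 0))    # auto-termination disabled
```

With a 90-minute idle period, the check fires for a 60-minute threshold, does not fire for a 120-minute threshold, and never fires when the threshold is 0.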

https://docs.databricks.com/en/compute/clusters-manage.html#configure-automatic-termination
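If you create compute through the Clusters API rather than the UI, the equivalent setting is, as far as I know, the `autotermination_minutes` field of the cluster spec. The sketch below only builds the request body and does not call any API; the helper name `build_cluster_spec` and all values other than `autotermination_minutes` are placeholders you would replace with your own.

```python
import json


def build_cluster_spec(autotermination_minutes):
    """Build an example cluster spec (hypothetical helper, placeholder values)."""
    return {
        "cluster_name": "long-running-calculations",  # placeholder name
        "spark_version": "<spark-version>",           # placeholder, fill in your runtime
        "node_type_id": "<node-type-id>",             # placeholder, depends on your cloud
        "num_workers": 1,
        # 0 disables auto-termination; a positive value is the idle period in minutes.
        "autotermination_minutes": autotermination_minutes,
    }


spec = build_cluster_spec(0)
print(json.dumps(spec, indent=2))
```

Whether the Community Edition accepts this setting is a separate question.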

I am not sure whether the Community Edition has any restrictions on that, though.
