Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Auto-termination for clusters, jobs, and Delta Live Tables does not terminate clusters on GCP.

638555
New Contributor III

Hello,

I am new to Databricks and have been trying to understand how auto-termination works, but I am unsure whether the problem comes from my configuration or from something else. The symptom is the same in every case: the cluster that Databricks creates on GCP does not auto-terminate, although on the Databricks side each case looks different.

1. For clusters created through the compute interface, I have a single-node cluster (I tried multi-node too) that is set to terminate after 2 hours. I spin it up and attach a notebook or a job to it. After the job finishes, I let the cluster idle for more than 2 hours. Although Databricks shows the cluster as terminated, in GCP I still have a rogue cluster running that was created by Databricks. I have no pools, policies, or anything else configured, and Databricks shows nothing running under all-purpose or job compute.

2. For jobs, the behavior is the same as above if I set the job to run on my running all-purpose cluster.

3. For Delta Live Tables, a job compute cluster is created automatically, and after the pipeline operation finishes I let it idle for more than 2 hours (development environment), yet the job cluster is still running. In this case I can see it running in both Databricks and GCP. I tried setting pipelines.clusterShutdown.delay too, but it has no effect.
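For reference, pipelines.clusterShutdown.delay is normally set inside the pipeline's configuration map rather than anywhere else in the pipeline definition. A minimal sketch of where it would sit; the pipeline name, notebook path, and the duration value here are illustrative assumptions, not values from this post:

```python
# Sketch of Delta Live Tables pipeline settings, showing where the
# pipelines.clusterShutdown.delay option would live. The name, notebook
# path, and duration value are illustrative assumptions.
pipeline_settings = {
    "name": "example-dlt-pipeline",
    "development": True,  # development mode keeps the cluster warm between runs
    "configuration": {
        # Duration string controlling how long the development cluster idles
        # before shutting down; the exact value to use is an assumption here.
        "pipelines.clusterShutdown.delay": "60s",
    },
    "libraries": [{"notebook": {"path": "/Repos/example/dlt_notebook"}}],
}
```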

In all three cases, the cluster keeps running until I delete it manually from GCP. How can I make sure my clusters shut down properly on GCP so I don't get charged?

Thank you.
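For reference, the 2-hour idle timeout described in case 1 corresponds to the `autotermination_minutes` field of a cluster spec in the Databricks Clusters API. A minimal sketch of such a spec for a single-node GCP cluster; the runtime version and node type are illustrative placeholders, not values from this thread:

```python
# Sketch of a cluster spec as it could be sent to the Databricks Clusters API
# (POST /api/2.1/clusters/create). Runtime version and node type are assumed
# placeholders; only autotermination_minutes reflects the 2-hour setting above.
cluster_spec = {
    "cluster_name": "auto-terminating-test",
    "spark_version": "13.3.x-scala2.12",  # assumed LTS runtime
    "node_type_id": "n2-highmem-4",       # assumed GCP node type
    "num_workers": 0,                     # single-node cluster
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
    "autotermination_minutes": 120,       # idle timeout: 2 hours, as in case 1
}
print(cluster_spec["autotermination_minutes"])  # → 120
```

Note that this setting only governs the Databricks-managed Spark compute; as the accepted answer below explains, the long-lived GKE infrastructure in the GCP account is a separate resource.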

1 ACCEPTED SOLUTION


LandanG
Databricks Employee

Hi @Tilemachos Charalampous,

The compute resources in your GCP account might not be the Spark clusters but rather the GKE cluster that Databricks spins up to host the Databricks architecture in your account.

The note in the blue highlight in the docs here https://docs.gcp.databricks.com/administration-guide/account-settings-gcp/workspaces.html#create-and... goes into this in more detail.

If no clusters are running but you still see a Databricks-created GKE cluster, it would most likely be that.
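To make the distinction concrete, here is a small illustrative sketch that separates a long-lived, Databricks-managed GKE cluster from other clusters in a project by name. The `db-` prefix is an assumption about how Databricks names its GKE cluster and may differ in your account; check the workspace docs for the exact naming:

```python
# Illustrative only: given the names of clusters visible in a GCP project,
# pick out the ones that look like Databricks-managed GKE infrastructure.
# The "db-" prefix is an assumed naming convention, not a documented fact.
def databricks_gke_clusters(cluster_names):
    """Return the names that look like Databricks-managed GKE clusters."""
    return [name for name in cluster_names if name.startswith("db-")]

sample = ["db-1234567890123456", "my-app-gke", "staging"]
print(databricks_gke_clusters(sample))  # → ['db-1234567890123456']
```

A cluster that matches this pattern and persists while all Databricks compute shows as terminated is the infrastructure cluster described above, not a rogue Spark cluster.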


3 REPLIES

638555
New Contributor III

Digging more into this, I realized that even if I terminate the cluster on GCP, it is automatically respawned shortly afterward.

All clusters on Databricks are terminated, and no job clusters or pools appear either.

So I have a rogue GCP Databricks-created cluster running constantly.


638555
New Contributor III

Hi @Landan George, thanks for the answer. This looks correct; I probably missed it while going over the documentation. Thanks for helping.
