Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

What happens to clusters whose jobs are canceled or terminated due to failures? (Jobs triggered through the Jobs API 2.1 using runs/submit)

Junee
New Contributor III

I am using the Databricks Jobs API 2.1 to trigger and run my jobs. The "jobs/runs/submit" endpoint starts the cluster, creates the job, and runs it. This API works great for normal jobs, as it also cleans up the cluster once the job finishes successfully.
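For context, this is a minimal sketch of the kind of one-time run submission I mean (the workspace URL, token, and notebook path below are placeholders, not real values):

# Minimal one-time run submission via the Jobs API 2.1 (placeholder HOST, TOKEN, notebook path)
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder workspace URL
TOKEN = "<personal-access-token>"                        # placeholder token

payload = {
    "run_name": "one-time-run",
    "tasks": [
        {
            "task_key": "main",
            "new_cluster": {  # the job cluster created just for this run
                "spark_version": "11.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
            "notebook_task": {"notebook_path": "/Repos/demo/my_notebook"},  # placeholder path
        }
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/runs/submit",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json()["run_id"])  # poll /api/2.1/jobs/runs/get with this run_id to track the run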

I know from my POC that when a run is not successful (failed, canceled, or terminated), the cluster created by the API remains visible in the Job Clusters UI.

I want to know how long these job clusters are maintained by default. Or are they kept until they are cleaned up manually?

Thanks for the read and help on this.


5 REPLIES

Hubert-Dudek
Esteemed Contributor III

This is a really interesting question. I think it is better to use an instance pool for these job tasks, so you can see in the pool how the servers behave and also have settings to control that behavior. From my experience, the VM exists for at least a few minutes after the job fails (when the pool has min idle set to 0). What technically happens after a job fails in that case should rather be answered by someone from inside Databricks @Kaniz Fatma
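For illustration, a rough sketch of how the submitted run's cluster spec could reference a pool instead of a node type (the instance_pool_id below is a made-up placeholder):

# Sketch: point the run's job cluster at an instance pool instead of a node type
new_cluster = {
    "spark_version": "11.3.x-scala2.12",
    "instance_pool_id": "0101-120000-pool-abcdef",  # placeholder pool id; omit node_type_id when using a pool
    "num_workers": 2,
}
# On the pool itself, min_idle_instances and idle_instance_autotermination_minutes
# control how long released VMs stay warm after a run fails or finishes.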

User16871418122 (Accepted Solution)
Contributor III

The job clusters for finished or failed runs are kept in the Job Clusters UI. Up to 30 of the most recently terminated job clusters are retained there; older ones are removed. Finished or canceled runs are also cleaned up automatically, starting with the oldest terminated cluster first.

One more thing to note: the list of terminated clusters maintained in the Databricks UI is only a configuration reference and audit trail of recent runs. These clusters are not linked to any cloud resources (VMs, IPs, or disks), so they do not incur any actual cost in the cloud.
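If you want to check this yourself, a rough sketch using the Clusters API (HOST and TOKEN are placeholders for your workspace URL and token):

# List clusters and keep only those created by job runs, i.e. what the Job Clusters UI retains
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder

resp = requests.get(
    f"{HOST}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()

job_clusters = [c for c in resp.json().get("clusters", []) if c.get("cluster_source") == "JOB"]
for c in job_clusters:
    print(c["cluster_id"], c["state"], c.get("state_message", ""))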

Hi @Gobinath Viswanathan, thank you for your detailed answer.

Is this something you know from experience, or do you have a source for this information in the official documentation?

User16871418122
Contributor III

@Junee, anytime! 🙂 It is clearly mentioned in the docs too: https://docs.databricks.com/clusters/index.html

Thanks @Gobinath Viswanathan 🙂
