11-22-2021 02:28 AM
I am using the Databricks Jobs API 2.1 to trigger and run my jobs. The "jobs/runs/submit" endpoint starts the cluster, creates the job, and runs it. It works great for normal jobs, as it also cleans up the cluster once the job has finished successfully.
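For reference, a call of this kind might look roughly like the sketch below. It only builds the request and does not send it. The workspace host, token, run name, task key, Spark version, and node type are all placeholder assumptions, not values from this thread, so adjust them for your own workspace.

```python
import json
import urllib.request


def build_submit_request(host, token, notebook_path):
    """Build (but do not send) a POST to jobs/runs/submit for a one-time run."""
    payload = {
        "run_name": "one-time-run-example",  # hypothetical run name
        "tasks": [
            {
                "task_key": "main",  # hypothetical task key
                # A new job cluster is created for this run only.
                "new_cluster": {
                    "spark_version": "9.1.x-scala2.12",  # assumed runtime
                    "node_type_id": "i3.xlarge",  # assumed node type
                    "num_workers": 1,
                },
                "notebook_task": {"notebook_path": notebook_path},
            }
        ],
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/runs/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token; nothing is sent over the network here.
req = build_submit_request(
    "https://example.cloud.databricks.com", "my-token", "/Users/me/my-notebook"
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would actually submit the run; the JSON response should then contain a `run_id` that can be used to poll the run state.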
I know from my POC that when a job is not successful (Failed, Cancelled or Terminated), the cluster created by the API is kept in the Job Clusters UI.
How long are these job clusters kept by default? Or are they kept until they are cleaned up manually?
Thanks for reading, and for any help on this.
11-23-2021 10:46 PM
The job clusters for finished or failed runs are kept in the Job Clusters UI. Up to 30 recently terminated job clusters are retained in the UI; beyond that, entries for finished or canceled runs are cleaned up automatically, starting with the oldest terminated cluster.
One more thing to note: the list of terminated clusters in the Databricks UI is kept only as a config reference or audit trail of recent runs. These clusters are not linked to any cloud resources (VMs, IPs, or disks), so they do not incur any actual cost in the cloud.
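If you want to check this list programmatically rather than in the UI, a minimal sketch follows. It assumes you have already fetched and parsed the JSON from the Clusters API `clusters/list` endpoint, and that each entry carries `cluster_source`, `state`, and `terminated_time` (epoch milliseconds) fields, which is how I recall the response looking. The sample data is made up purely for illustration.

```python
def terminated_job_clusters(list_response):
    """Return terminated job clusters from a parsed clusters/list response, oldest first."""
    clusters = list_response.get("clusters", [])
    job_clusters = [
        c
        for c in clusters
        if c.get("cluster_source") == "JOB" and c.get("state") == "TERMINATED"
    ]
    # Oldest terminated cluster first, the order in which entries get cleaned up.
    return sorted(job_clusters, key=lambda c: c.get("terminated_time", 0))


# Made-up sample response, for illustration only.
sample = {
    "clusters": [
        {"cluster_id": "a", "cluster_source": "JOB", "state": "TERMINATED",
         "terminated_time": 1637700000000},
        {"cluster_id": "b", "cluster_source": "UI", "state": "RUNNING"},
        {"cluster_id": "c", "cluster_source": "JOB", "state": "TERMINATED",
         "terminated_time": 1637600000000},
    ]
}

for cluster in terminated_job_clusters(sample):
    print(cluster["cluster_id"], cluster["terminated_time"])
```

Since these entries are only configuration records, nothing here touches or bills any cloud resources.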
11-22-2021 04:58 AM
This is a really interesting question. I would suggest using a pool for these job tasks, so that you can see in the pool how the VMs behave and also control that through the pool settings. From my experience, the VM exists for at least a few minutes after the job has failed (when the pool has min idle set to 0). What technically happens after a job fails in that case is better answered by someone inside Databricks, @Kaniz Fatma.
11-24-2021 08:10 PM
Hi @Gobinath Viswanathan , Thank you for your detailed answer.
Is this something you know from experience, or do you have a source for this information in the official documentation?
11-24-2021 08:14 PM
@Junee, Anytime! 🙂 It is crisply mentioned in the doc too. https://docs.databricks.com/clusters/index.html
11-24-2021 08:28 PM
Gracias @Gobinath Viswanathan 🙂