Databricks Community

smurug · ‎08-01-2023

When a Databricks job is scheduled in continuous mode, what happens if the job is configured to run on a job cluster?

At the end of each run, will the cluster be terminated and re-created for the next run? The official documentation is not clear on this; it only mentions that there will be a slight delay of less than 60 seconds.

However, a quick practical check of this scenario suggests that the cluster is being re-created: a simple do-nothing notebook takes 2 minutes to complete, and from the logs it looks like different clusters are used. This is not conclusive, though.

I would appreciate any thoughts on this. Logically, the continuous option should re-use the cluster (to save on start-up time); otherwise the value this option brings is limited.
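One way to make the check above more conclusive is to compare the cluster IDs recorded for consecutive runs. The sketch below assumes the run records come from the Jobs API run listing and carry a `cluster_instance.cluster_id` field at the run level and/or per task; the sample data, run IDs, and cluster IDs are hypothetical and only illustrate the shape.

```python
from typing import Any


def cluster_ids_per_run(runs: list[dict[str, Any]]) -> dict[int, set[str]]:
    """Map each run_id to the set of cluster IDs recorded for that run."""
    result: dict[int, set[str]] = {}
    for run in runs:
        ids: set[str] = set()
        # Assumption: single-task runs may report cluster_instance at the run level.
        top_level = run.get("cluster_instance", {}).get("cluster_id")
        if top_level:
            ids.add(top_level)
        # Assumption: multi-task runs report cluster_instance per task.
        for task in run.get("tasks", []):
            task_cluster = task.get("cluster_instance", {}).get("cluster_id")
            if task_cluster:
                ids.add(task_cluster)
        result[run["run_id"]] = ids
    return result


def clusters_reused(runs: list[dict[str, Any]]) -> bool:
    """Return True if any cluster ID shows up in more than one run."""
    seen: set[str] = set()
    for ids in cluster_ids_per_run(runs).values():
        if seen & ids:
            return True
        seen |= ids
    return False


# Hypothetical records for two consecutive runs of a continuous job.
sample_runs = [
    {"run_id": 101, "tasks": [{"cluster_instance": {"cluster_id": "0801-aaa"}}]},
    {"run_id": 102, "tasks": [{"cluster_instance": {"cluster_id": "0801-bbb"}}]},
]

print(cluster_ids_per_run(sample_runs))
print(clusters_reused(sample_runs))
```

If every run maps to a different cluster ID, that would support the observation that a new job cluster is created per run.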

Tharun-Kumar · ‎08-02-2023

@smurug

A job cluster is designed to be unique to each run of a job, so each run of your job would run on a new job cluster.

If you want your job to run continuously without any delay and to re-use the cluster, I would recommend using a dedicated interactive cluster. In this case, the cluster is retained across job runs, and each run starts as soon as the previous run completes.
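As a rough illustration of the two options, the sketch below builds a Jobs API 2.1 style settings payload for a continuous job, either with a per-run job cluster (`new_cluster`) or attached to an already running cluster (`existing_cluster_id`). The job name, notebook path, cluster ID, Spark version, and node type are hypothetical placeholders, and the helper function itself is only an example, not part of any Databricks library.

```python
import json
from typing import Any, Optional


def continuous_job_settings(
    name: str,
    notebook_path: str,
    existing_cluster_id: Optional[str] = None,
) -> dict[str, Any]:
    """Build job settings for a continuous job (illustrative helper)."""
    task: dict[str, Any] = {
        "task_key": "main",
        "notebook_task": {"notebook_path": notebook_path},
    }
    if existing_cluster_id:
        # Interactive (all-purpose) cluster: stays up between runs.
        task["existing_cluster_id"] = existing_cluster_id
    else:
        # Job cluster: created for the run. Values here are placeholders.
        task["new_cluster"] = {
            "spark_version": "13.3.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 1,
        }
    return {
        "name": name,
        # Continuous trigger: a new run starts after the previous one ends.
        "continuous": {"pause_status": "UNPAUSED"},
        "tasks": [task],
    }


# Hypothetical names and IDs, for illustration only.
job_cluster_variant = continuous_job_settings("poll-files", "/Jobs/poll_files")
interactive_variant = continuous_job_settings(
    "poll-files", "/Jobs/poll_files", existing_cluster_id="0801-123456-abcdefgh"
)

print(json.dumps(job_cluster_variant, indent=2))
```

The only difference between the two payloads is the cluster field on the task, which is what decides whether start-up time is paid on every run.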

smurug · ‎08-02-2023

Thanks for the response. Yes, we are doing this currently (using an interactive cluster). However, the following points are prompting us to re-evaluate this approach and look for an alternative, if one exists:

1) Cost difference between Interactive and Job cluster

2) In the Production environment, the following error is received every now and then:

run failed with error message Context ExecutionContextId(1496834584910869936) is disconnected.

While this error can occur for multiple reasons, cluster resource constraints are one of the main ones, as far as we understand. Hence the idea is to have individual job clusters for different jobs, which can be scaled independently, so that each job gets dedicated resources rather than sharing one interactive cluster across all jobs. Creating many interactive clusters might not be feasible given the cost, so using job clusters could offset some of that and help reduce the overall cost.

Further, searching around the net, I found this article, https://medium.com/@24chynoweth/continuous-jobs-and-file-triggers-in-databricks-e7ba51a0c93a, which mentions resources being re-used.

Also, the official documentation, https://docs.databricks.com/workflows/jobs/schedule-jobs.html, does not say anything clear about re-use or termination, but mentions that there will be a slight delay of less than 60 seconds. If the cluster needs to be re-created, I don't think a delay of only 60 seconds can be guaranteed.
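To separate the delay between runs from the cluster start-up time, the two can be measured independently from run records. This is a minimal sketch assuming each record has `start_time` and `end_time` in epoch milliseconds and a `setup_duration` in milliseconds, as returned for runs by the Jobs API (for multi-task runs the run-level `setup_duration` may be 0 and the per-task values would be needed instead). The timestamps below are made-up sample values, not measurements.

```python
from typing import Any


def gaps_between_runs_seconds(runs: list[dict[str, Any]]) -> list[float]:
    """Seconds between one run's end_time and the next run's start_time."""
    ordered = sorted(runs, key=lambda r: r["start_time"])
    return [
        (nxt["start_time"] - prev["end_time"]) / 1000.0
        for prev, nxt in zip(ordered, ordered[1:])
    ]


def setup_seconds(runs: list[dict[str, Any]]) -> dict[int, float]:
    """Seconds each run spent in setup (e.g. waiting for its cluster)."""
    return {r["run_id"]: r.get("setup_duration", 0) / 1000.0 for r in runs}


# Hypothetical run records; all values are illustrative.
timing_sample = [
    {"run_id": 101, "start_time": 1_690_000_000_000,
     "end_time": 1_690_000_120_000, "setup_duration": 95_000},
    {"run_id": 102, "start_time": 1_690_000_135_000,
     "end_time": 1_690_000_250_000, "setup_duration": 90_000},
]

print(gaps_between_runs_seconds(timing_sample))
print(setup_seconds(timing_sample))
```

If the gap between runs stays well under 60 seconds while the setup time is over a minute, the documented delay would be describing the trigger interval rather than cluster creation. That reading is an assumption to verify against real run data.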

Jo5h · ‎09-29-2023

Hello @youssefmrini

So how are the DBUs calculated? As the cluster is reused, should the DBUs be calculated per hour across all the job runs in that hour? Or will they be calculated per run?

I would like to know how the cost is calculated when running a continuous job.


Databricks Job scheduling - continuous mode
