
DLT Pipeline Validate will always spawn new cluster

T0M
New Contributor III

Hi all!

I've started learning DLT-Pipelines but I am struggling with the development of a pipeline.

As far as I understand it, once I click "Validate", a cluster spins up and stays running (by default for 2 hours) if the pipeline is in "Development" mode.

So far so good. I can see the running cluster and it stays up even after the validation is over.

However, every time I make a minor change (like editing a SELECT statement) and click "Validate" again, it terminates the running cluster and spins up a new one. No changes in the settings, just a simple edit in a query.

Could someone tell me, what is the expected workflow when developing DLT pipelines?
Am I doing something wrong?
Should I consider something I am not aware of?

3 REPLIES

Alberto_Umana
Databricks Employee

Hi @T0M,

It is expected that the cluster restarts with each validation to ensure that your changes are accurately reflected. 

  • In development mode, once you click "Validate," a cluster will spin up, and it stays active for up to two hours by default. This mode is optimized for quickly detecting and fixing errors by reusing clusters to avoid the overhead associated with frequent restarts.
     
  • However, if you make changes to the pipeline and then click "Validate" again, the cluster will terminate and a new one will spin up to apply those changes. This is necessary to make sure the pipeline processes the updated code correctly.

It is recommended to batch several changes together and validate them in one go to minimize cluster restarts.
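For reference, the development mode and the two-hour idle window mentioned above are both part of the pipeline settings. Below is a minimal sketch in Python that builds such a settings document. It assumes the `development` flag and the `pipelines.clusterShutdown.delay` configuration key behave as described in the Databricks pipeline settings documentation; the helper name, pipeline name, and notebook path are hypothetical placeholders, not values from this thread.

```python
import json


def build_dev_pipeline_settings(name, notebook_path, shutdown_delay_seconds=7200):
    """Build a pipeline settings dict for development mode.

    Hypothetical helper for illustration only. The default of 7200 seconds
    mirrors the two-hour default mentioned in the thread.
    """
    return {
        "name": name,
        # Development mode: the cluster is kept alive after an update or validation.
        "development": True,
        # Assumed key: controls how long the idle cluster stays up before shutdown.
        "configuration": {
            "pipelines.clusterShutdown.delay": f"{shutdown_delay_seconds}s",
        },
        "libraries": [{"notebook": {"path": notebook_path}}],
    }


# Placeholder name and path; replace with your own.
settings = build_dev_pipeline_settings(
    "my_dev_pipeline", "/path/to/pipeline_notebook", shutdown_delay_seconds=600
)
print(json.dumps(settings, indent=2))
```

The printed JSON can be compared against what the pipeline's settings editor shows in JSON view; whether a given workspace honors a shorter delay should be checked there rather than assumed.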

 

T0M
New Contributor III

 

Many thanks for your help, @Alberto_Umana.

Do I understand it correctly that the cluster stays active after "Validate" to start the Pipeline faster when clicking on "Start"? When doing another round of "Validate", I will not profit from the active cluster, right?

T0M
New Contributor III

Well, it turns out that if I keep the default cluster settings when creating a new pipeline, it works as expected (every new "Validate" skips the "Waiting for resources" step).

Initially, I reduced the number of workers to a minimum for the development. 

BTW: Working on the GCP version.
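If someone wants to narrow down which customized setting triggers the restart, one approach is to copy the cluster block from the JSON settings of a default pipeline and of the customized one and diff them. The sketch below is only an illustration: both cluster specs are made-up examples (the real defaults may differ), and `diff_settings` is a hypothetical helper, not a Databricks API.

```python
def diff_settings(a, b, prefix=""):
    """Return the sorted dotted key paths whose values differ between two dicts."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}.{key}" if prefix else str(key)
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            # Recurse into nested sections present on both sides.
            diffs.extend(diff_settings(va, vb, path))
        elif va != vb:
            diffs.append(path)
    return diffs


# Hypothetical cluster specs: paste the real ones from each pipeline's JSON settings.
default_cluster = {"label": "default", "autoscale": {"min_workers": 1, "max_workers": 5}}
custom_cluster = {"label": "default", "num_workers": 1}

changed = diff_settings(default_cluster, custom_cluster)
print(changed)
```

Reverting the reported keys one at a time and validating again should show which one prevents the cluster from being reused.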
