
Cluster Reuse for delta live tables

John_BardessGro
New Contributor II

I have several Delta Live Tables notebooks, each tied to a different Delta Live Tables job so that I can use multiple target schema names. I know a cluster can be reused across the tasks within a job, but is it possible for these Delta Live Tables jobs, which run in sequence, to reuse the cluster created by the first job? I'm running into quota issues in the customer's Azure environment, and this would help a lot.

It seems that if I run an individual job (let's call it silver) multiple times, it picks up the cluster it used before, but if I run another job (let's call it gold), it tries to start its own cluster, even though both are configured with the same cluster configuration.

3 REPLIES

Hubert-Dudek
Esteemed Contributor III

The same DLT job (workflow) reuses its cluster in development mode, where the cluster shuts down after 2 hours, and starts a new cluster in production mode, where the shutdown delay is 0. You can change that delay in the pipeline's JSON settings:

{
  "configuration": {
    "pipelines.clusterShutdown.delay": "60s"
  }
}
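
To see why the shutdown delay matters for quota when several pipelines run back to back, here is a rough sketch of a simplified model. It assumes each pipeline gets its own cluster (as described in the question), that a cluster stays alive for the configured delay after its update finishes, and it ignores cluster startup time. The function name and the run timings are hypothetical.

```python
def peak_live_clusters(runs, shutdown_delay_s):
    """Return the peak number of clusters alive at once.

    runs: list of (start_s, end_s) tuples, one per pipeline update.
    Assumption: every pipeline has its own cluster, which is released
    shutdown_delay_s seconds after that pipeline's update ends.
    """
    events = []
    for start, end in runs:
        events.append((start, 1))                    # cluster acquired
        events.append((end + shutdown_delay_s, -1))  # cluster released
    # At equal timestamps, process releases (-1) before acquisitions (+1).
    events.sort(key=lambda e: (e[0], e[1]))
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


# Three pipelines run in sequence, 10 minutes each (made-up numbers).
sequential = [(0, 600), (600, 1200), (1200, 1800)]
print(peak_live_clusters(sequential, 0))     # delay 0
print(peak_live_clusters(sequential, 60))    # delay 60s
print(peak_live_clusters(sequential, 7200))  # delay 2h
```

Under this model, a delay of 0 keeps at most one cluster alive, a 60s delay lets two overlap briefly, and a 2h delay leaves all three clusters counting against the quota at the same time.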

You can work around Azure's quotas by using different instance types. For smaller streams, you can also set the number of workers to 0:

{
  "clusters": [
    {
      "label": "default",
      "node_type_id": "Standard_D3_v2",
      "driver_node_type_id": "Standard_D3_v2",
      "num_workers": 0
    }
  ]
}
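
If the silver and gold pipelines are meant to share one cluster configuration, generating their settings from a single definition keeps them in sync. The sketch below is only an illustration: the `clusters` and `configuration` fragments come from this reply, while the `name`, `target`, and `libraries` fields, the helper name, and the notebook paths are assumptions about how the pipelines are set up. It does not make the pipelines share a cluster; it only keeps both clusters small and short-lived.

```python
import json

# Cluster fragment taken from the reply above.
SHARED_CLUSTER = {
    "label": "default",
    "node_type_id": "Standard_D3_v2",
    "driver_node_type_id": "Standard_D3_v2",
    "num_workers": 0,
}


def build_pipeline_settings(name, target, notebook_path, shutdown_delay="60s"):
    """Build a settings dict for one pipeline (hypothetical helper)."""
    return {
        "name": name,                      # assumed field
        "target": target,                  # assumed field: target schema
        "libraries": [{"notebook": {"path": notebook_path}}],  # assumed field
        "clusters": [dict(SHARED_CLUSTER)],  # copy so pipelines don't share state
        "configuration": {"pipelines.clusterShutdown.delay": shutdown_delay},
    }


for layer in ("silver", "gold"):
    settings = build_pipeline_settings(
        name=f"{layer}_pipeline",
        target=layer,
        notebook_path=f"/Repos/example/{layer}_notebook",  # hypothetical path
    )
    print(json.dumps(settings, indent=2))
```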

I hope that pools and serverless options will be added to DLT.

Thank you for sharing this, @Hubert Dudek. @John Fico, I highly recommend following Hubert's recommendations. If you would like to check our docs, please go here: https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-configuration.html#cluster...

Kaniz
Community Manager

Hi @John Fico, we haven't heard from you since the last responses from @Hubert Dudek and @Jose Gonzalez, and I was checking back to see if you have a resolution yet.

If you have a solution, please share it with the community, as it can be helpful to others. Otherwise, we will respond with more details and try to help.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.
