Cluster Reuse for delta live tables
10-21-2022 09:40 AM
I have several Delta Live Tables notebooks tied to different DLT jobs so that I can use multiple target schema names. I know it's possible to reuse a cluster across job segments, but is it possible for these DLT jobs (which run in sequence) to reuse the cluster created by the first job? I'm running into quota issues in the customer's Azure environment, and this would help a lot.
It seems that if I run an individual job (let's call it silver) multiple times, it picks up the cluster it used before, but if I run another job (let's call it gold), it tries to start its own cluster even though both are configured with the same cluster configuration.
- Labels:
  - Delta
  - Delta Live Tables
  - Job Cluster
  - Jobs
10-23-2022 02:19 PM
The same DLT job (workflow) reuses its cluster in development mode (the cluster shuts down after 2 hours of inactivity) but starts a new one in production mode (shutdown delay of 0). You can override that delay in the pipeline's JSON settings:
```json
{
  "configuration": {
    "pipelines.clusterShutdown.delay": "60s"
  }
}
```
You can work around Azure quota limits by using different instance types, and for smaller streams you can set the worker count to 0 (single-node):
```json
{
  "clusters": [
    {
      "label": "default",
      "node_type_id": "Standard_D3_v2",
      "driver_node_type_id": "Standard_D3_v2",
      "num_workers": 0
    }
  ]
}
```
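Putting the two settings together, a pipeline's full JSON settings might look like the sketch below. This is only an illustration: the pipeline name and notebook path are placeholders, and the 60-second shutdown delay and Standard_D3_v2 node type are example values you would tune for your own quota situation.

```json
{
  "name": "example-pipeline",
  "configuration": {
    "pipelines.clusterShutdown.delay": "60s"
  },
  "clusters": [
    {
      "label": "default",
      "node_type_id": "Standard_D3_v2",
      "driver_node_type_id": "Standard_D3_v2",
      "num_workers": 0
    }
  ],
  "libraries": [
    { "notebook": { "path": "/Repos/example/dlt_notebook" } }
  ]
}
```

With a longer shutdown delay, a second run of the same pipeline that starts within the delay window can pick up the still-running cluster instead of requesting new VMs from the Azure quota.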
I hope that instance pools and serverless options will be added to DLT.
10-24-2022 11:16 AM
Thank you for sharing this, @Hubert Dudek. @John Fico, I highly recommend following Hubert's recommendations. If you would like to check our docs, please see https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-configuration.html#cluster...

