Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

DAB - Common cluster configs possible?

RobCox
New Contributor II

I've been trying various solutions, and perhaps I'm just thinking about this the wrong way.

We're migrating over from Synapse, where we're used to having a defined set of DBX cluster profiles to run our jobs against. These are all job clusters created via the API, so they basically act as templates for us.

Now that we're moving over to asset bundles, I'm trying to work out how we can have this "common" set of clusters for each of our DAB repos to use, so there's some uniformity between them.

Something I was aiming for:

- Define all the cluster types in a single file (e.g. clusters.yml)
- Allow a per-target/task override of the default by simply providing the cluster name, e.g. "Driver_Only_DSV3"

I have this working with on-demand clusters by leveraging existing_cluster_id and parameterising it, but to use job clusters it seems you must register all the job_clusters on each job. With 10-15 cluster variants, defining those for every job in every repo is a lot of noise, and I haven't actually got a solution working using this method.
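
For reference, the on-demand version I have working is roughly this (simplified, and the variable/task names here are just illustrative):

variables:
  shared_cluster_id:
    description: "ID of the pre-created all-purpose cluster to run against"

resources:
  jobs:
    my_job:
      tasks:
        - task_key: my_task
          notebook_task:
            notebook_path: ../src/my_notebook.ipynb
          existing_cluster_id: ${var.shared_cluster_id}

targets:
  prod:
    variables:
      shared_cluster_id: "<cluster-id-for-prod>"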


2 REPLIES

saurabh18cs
Honored Contributor

Hi, you can also parametrize your job clusters:

 

job_clusters:
  - job_cluster_key: Job_cluster
    new_cluster:
      spark_version: ${var.spark_version}
      spark_conf: ${var.spark_configuration}
      azure_attributes:
        first_on_demand: 1
        availability: ON_DEMAND_AZURE
        spot_bid_max_price: -1
      node_type_id: ${var.cluster_node_type_id}
      spark_env_vars:
        PYSPARK_PYTHON: /databricks/python3/bin/python3
        LOG_LEVEL: DEBUG
        BUNDLE_ROOT_DIR: ${workspace.file_path}
      enable_elastic_disk: true
      data_security_mode: SINGLE_USER
      num_workers: ${var.cluster_worker_nodes}
      instance_pool_id: ${var.executor_instance_pool_id}
      driver_instance_pool_id: ${var.driver_instance_pool_id}
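
The ${var...} references just need a matching variables block in your databricks.yml (the defaults below are only examples, adjust to your setup):

variables:
  spark_version:
    description: "Databricks Runtime version"
    default: "15.4.x-scala2.12"
  cluster_node_type_id:
    description: "Node type for workers and driver"
    default: "Standard_DS3_v2"
  cluster_worker_nodes:
    description: "Number of worker nodes"
    default: 2

The pool IDs follow the same pattern, and each target can override any of these values under targets.<name>.variables.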

RobCox
New Contributor II

This is one option, yes, but ideally I'm looking to be able to say:

- Define cluster types once, usable by N packages that use Databricks asset bundles
- Allow the bundle to simply say: run JOB.TASK as Cluster_Type_1 for Production

The primary reason for this is that we have common tagging strategies to apply to our clusters irrespective of which asset bundle is being deployed, plus common cluster configurations / Spark conf setups.

A nice, simple "middle ground" would be a common clusters.yml that can be dropped into any bundle: if we decided to change the cluster_type_1 configuration and needed to replace it in 15 repos, it would be easy to change that one file. A rough sketch of what I mean is below.
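
Untested sketch of what I'm picturing, assuming a recent CLI with complex-variable support (I'm not sure yet whether variable definitions can live in an included file or have to sit in databricks.yml itself; names are made up):

# clusters.yml - the file we'd keep identical across repos
variables:
  cluster_type_1:
    type: complex
    description: "Standard small job cluster with our common tags"
    default:
      spark_version: "15.4.x-scala2.12"
      node_type_id: "Standard_DS3_v2"
      num_workers: 2
      custom_tags:
        cost_centre: data-platform

# databricks.yml in each repo
include:
  - clusters.yml

resources:
  jobs:
    my_job:
      job_clusters:
        - job_cluster_key: default
          new_cluster: ${var.cluster_type_1}

Production could then point the same job at a different cluster type just by overriding the variable in the prod target.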
