Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Facing issues with Databricks Asset Bundles: all jobs are deployed to every target instead of only the defined target

umahesb3
New Contributor
I am facing an issue with Databricks Asset Bundles: all jobs are being deployed to every target instead of only the target they are defined for. Below are the files I am using (the job resource YAML files and the databricks.yml file). I am on Databricks CLI v0.240.0 and started from the databricks bundle init default-python template. Can you please help resolve this issue? It is a show-stopper for my use case.
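For reference, the deploy commands I run look roughly like this (a sketch of my workflow, using the targets defined in the config below):

databricks bundle validate -t dev
databricks bundle deploy -t dev
databricks bundle deploy -t dev_ca

Whichever target I deploy to, all of the jobs end up in it, not just the ones I intended for that target.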
Example 1:
#asset_bundles_job1.yaml
resources:
  jobs:
    asset_bundles_job1:
      name: asset_bundles_job1
      description: >+
        Extracts data from APIs.
      health:
        rules:
          - metric: RUN_DURATION_SECONDS
            op: GREATER_THAN
            value: 3600
      schedule:
        quartz_cron_expression: 0 0/15 * * * ?
        timezone_id: UTC
        pause_status: ${var.job_status}
      max_concurrent_runs: 1
      tasks:
        - task_key: task1
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script1.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
        - task_key: task2
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script2.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
  
      job_clusters:
        - job_cluster_key: '${var.job_cluster_name}'
          new_cluster:
            spark_version: 15.4.x-scala2.12
            spark_conf:
              spark.databricks.repl.allowedLanguages: 'sql,python,r,scala'
              spark.databricks.delta.retentionDurationCheck.enabled: false
              spark.databricks.hive.metastore.glueCatalog.enabled: true
              spark.hadoop.fs.s3a.acl.default: BucketOwnerFullControl
              spark.hadoop.hive.metastore.glue.catalogid: '${var.glue_catalog}'
            aws_attributes:
              first_on_demand: 1
              availability: SPOT_WITH_FALLBACK
              zone_id: auto
              instance_profile_arn: '${var.instance_profilearn}'
              spot_bid_price_percent: 100
              ebs_volume_count: 0
            node_type_id: '${var.node_type}'
            driver_node_type_id: '${var.node_type}'
            spark_env_vars:
              PYSPARK_PYTHON: /databricks/python3/bin/python3
            enable_elastic_disk: true
            data_security_mode: NONE
            runtime_engine: PHOTON
            autoscale:
              min_workers: 1
              max_workers: '${var.max_workers_instance}'
      queue:
        enabled: false
 
#asset_bundles_job2.yaml
resources:
  jobs:
    asset_bundles_job2:
      name: asset_bundles_job2
      description: >+
        Extracts data from APIs.
      health:
        rules:
          - metric: RUN_DURATION_SECONDS
            op: GREATER_THAN
            value: 3600
      schedule:
        quartz_cron_expression: 0 0/15 * * * ?
        timezone_id: UTC
        pause_status: ${var.job_status}
      max_concurrent_runs: 1
      tasks:
        - task_key: task1
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script1.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
        - task_key: task2
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script2.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
  
      job_clusters:
        - job_cluster_key: '${var.job_cluster_name}'
          new_cluster:
            spark_version: 15.4.x-scala2.12
            spark_conf:
              spark.databricks.repl.allowedLanguages: 'sql,python,r,scala'
              spark.databricks.delta.retentionDurationCheck.enabled: false
              spark.databricks.hive.metastore.glueCatalog.enabled: true
              spark.hadoop.fs.s3a.acl.default: BucketOwnerFullControl
              spark.hadoop.hive.metastore.glue.catalogid: '${var.glue_catalog}'
            aws_attributes:
              first_on_demand: 1
              availability: SPOT_WITH_FALLBACK
              zone_id: auto
              instance_profile_arn: '${var.instance_profilearn}'
              spot_bid_price_percent: 100
              ebs_volume_count: 0
            node_type_id: '${var.node_type}'
            driver_node_type_id: '${var.node_type}'
            spark_env_vars:
              PYSPARK_PYTHON: /databricks/python3/bin/python3
            enable_elastic_disk: true
            data_security_mode: NONE
            runtime_engine: PHOTON
            autoscale:
              min_workers: 1
              max_workers: '${var.max_workers_instance}'
      queue:
        enabled: false
 
#asset_bundles_job3.yaml
resources:
  jobs:
    asset_bundles_job3:
      name: asset_bundles_job3
      description: >+
        Extracts data from APIs.
      health:
        rules:
          - metric: RUN_DURATION_SECONDS
            op: GREATER_THAN
            value: 3600
      schedule:
        quartz_cron_expression: 0 0/15 * * * ?
        timezone_id: UTC
        pause_status: ${var.job_status}
      max_concurrent_runs: 1
      tasks:
        - task_key: task1
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script1.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
        - task_key: task2
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script2.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
  
      job_clusters:
        - job_cluster_key: '${var.job_cluster_name}'
          new_cluster:
            spark_version: 15.4.x-scala2.12
            spark_conf:
              spark.databricks.repl.allowedLanguages: 'sql,python,r,scala'
              spark.databricks.delta.retentionDurationCheck.enabled: false
              spark.databricks.hive.metastore.glueCatalog.enabled: true
              spark.hadoop.fs.s3a.acl.default: BucketOwnerFullControl
              spark.hadoop.hive.metastore.glue.catalogid: '${var.glue_catalog}'
            aws_attributes:
              first_on_demand: 1
              availability: SPOT_WITH_FALLBACK
              zone_id: auto
              instance_profile_arn: '${var.instance_profilearn}'
              spot_bid_price_percent: 100
              ebs_volume_count: 0
            node_type_id: '${var.node_type}'
            driver_node_type_id: '${var.node_type}'
            spark_env_vars:
              PYSPARK_PYTHON: /databricks/python3/bin/python3
            enable_elastic_disk: true
            data_security_mode: NONE
            runtime_engine: PHOTON
            autoscale:
              min_workers: 1
              max_workers: '${var.max_workers_instance}'
      queue:
        enabled: false
 
#databricks.yaml
bundle:
  name: mulitple_jobs
include:
  - resources/*.yml
variables:
  config_file:
    description: Config file for the respective environment
    default: ../../../resources/config/dit.ini
  config_file_a:
    description: Config file for the respective environment
    default: ../../../resources/config/a_dit.ini
  config_file_b:
    description: Config file for the respective environment
    default: ../../../resources/config/b_dit.ini
  config_region:
    description: config region
    default: regiona
  glue_catalog:
    description: Glue Catalogid Details
  instance_profilearn:
    description: Instance Profile Arn Details
  instance_profilearn_a:
    description: Instance Profile Arn Details
  max_workers_instance:
    description: Max workers
    default: 2
  node_type:
    description: Ec2 Instance Type
    default: r5d.xlarge
  job_cluster_name:
    description: Name of the Job Cluster
    default: job_cluster
  max_retries:
    description: Max retries for the task
    default: 1
  intv_seconds:
    description: Retry interval millis
    default: 15000
  job_status:
    description: Determines whether the jobs should be paused or unpaused per environment
    default: PAUSED
 
targets:
  dev:
    variables:
      config_file: ../../../resources/config/a_fit.ini
      config_file_b: ../../../resources/config/b_dit.ini
      glue_catalog: '123456'
      instance_profilearn: >-
        arn:aws:iam::123456:instance-profile/Databricks-role
      instance_profilearn_wisely: >-
        arn:aws:iam::123456:instance-profile/Databricks-role
      max_workers_instance: 2
      node_type: i3.2xlarge
      job_cluster_name: '${bundle.name}_Job'
      job_status: PAUSED
    mode: development
    default: true
    workspace:
      root_path: >-
        /Users/test1@gmail.com/.bundle/${bundle.name}/${bundle.target}
    run_as:
      user_name: test1@gmail.com
  
    resources:
      jobs:
        asset_bundles_job1:
          permissions:
            - user_name: test1@gmail.com
              level: CAN_MANAGE
          email_notifications:
            on_failure: [test1@gmail.com]
            on_duration_warning_threshold_exceeded: [test1@gmail.com]
            no_alert_for_skipped_runs: true
        asset_bundles_job2:
          permissions:
            - user_name: test1@gmail.com
              level: CAN_MANAGE
          email_notifications:
            on_failure: [test1@gmail.com]
            on_duration_warning_threshold_exceeded: [test1@gmail.com]
            no_alert_for_skipped_runs: true

  dev_ca:
    variables:
      config_file: ../../../resources/config/a_fit.ini
      config_file_b: ../../../resources/config/b_dit.ini
      glue_catalog: '123456'
      instance_profilearn: >-
        arn:aws:iam::123456:instance-profile/Databricks-role
      instance_profilearn_wisely: >-
        arn:aws:iam::123456:instance-profile/Databricks-role
      max_workers_instance: 2
      node_type: i3.2xlarge
      job_cluster_name: '${bundle.name}_Job'
      job_status: PAUSED
    mode: development
    default: true
    workspace:
      root_path: >-
        /Users/test1@gmail.com/.bundle/${bundle.name}/${bundle.target}
    run_as:
      user_name: test1@gmail.com
    resources:
      jobs:
        asset_bundles_job2:
          permissions:
            - user_name: test1@gmail.com
              level: CAN_MANAGE
          email_notifications:
            on_failure: [test1@gmail.com]
            on_duration_warning_threshold_exceeded: [test1@gmail.com]
            no_alert_for_skipped_runs: true
Example 2:
#asset_bundles_job1.yaml
resources:
  jobs:
    asset_bundles_job1:
      name: asset_bundles_job1
      description: >+
        Extracts data from APIs.
      health:
        rules:
          - metric: RUN_DURATION_SECONDS
            op: GREATER_THAN
            value: 3600
      schedule:
        quartz_cron_expression: 0 0/15 * * * ?
        timezone_id: UTC
        pause_status: ${var.job_status}
      max_concurrent_runs: 1
      tasks:
        - task_key: task1
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script1.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
        - task_key: task2
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script2.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
  
      job_clusters:
        - job_cluster_key: '${var.job_cluster_name}'
          new_cluster:
            spark_version: 15.4.x-scala2.12
            spark_conf:
              spark.databricks.repl.allowedLanguages: 'sql,python,r,scala'
              spark.databricks.delta.retentionDurationCheck.enabled: false
              spark.databricks.hive.metastore.glueCatalog.enabled: true
              spark.hadoop.fs.s3a.acl.default: BucketOwnerFullControl
              spark.hadoop.hive.metastore.glue.catalogid: '${var.glue_catalog}'
            aws_attributes:
              first_on_demand: 1
              availability: SPOT_WITH_FALLBACK
              zone_id: auto
              instance_profile_arn: '${var.instance_profilearn}'
              spot_bid_price_percent: 100
              ebs_volume_count: 0
            node_type_id: '${var.node_type}'
            driver_node_type_id: '${var.node_type}'
            spark_env_vars:
              PYSPARK_PYTHON: /databricks/python3/bin/python3
            enable_elastic_disk: true
            data_security_mode: NONE
            runtime_engine: PHOTON
            autoscale:
              min_workers: 1
              max_workers: '${var.max_workers_instance}'
      queue:
        enabled: false
 
#asset_bundles_job2.yaml
resources:
  jobs:
    asset_bundles_job2:
      name: asset_bundles_job2
      description: >+
        Extracts data from APIs.
      health:
        rules:
          - metric: RUN_DURATION_SECONDS
            op: GREATER_THAN
            value: 3600
      schedule:
        quartz_cron_expression: 0 0/15 * * * ?
        timezone_id: UTC
        pause_status: ${var.job_status}
      max_concurrent_runs: 1
      tasks:
        - task_key: task1
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script1.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
        - task_key: task2
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script2.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
  
      job_clusters:
        - job_cluster_key: '${var.job_cluster_name}'
          new_cluster:
            spark_version: 15.4.x-scala2.12
            spark_conf:
              spark.databricks.repl.allowedLanguages: 'sql,python,r,scala'
              spark.databricks.delta.retentionDurationCheck.enabled: false
              spark.databricks.hive.metastore.glueCatalog.enabled: true
              spark.hadoop.fs.s3a.acl.default: BucketOwnerFullControl
              spark.hadoop.hive.metastore.glue.catalogid: '${var.glue_catalog}'
            aws_attributes:
              first_on_demand: 1
              availability: SPOT_WITH_FALLBACK
              zone_id: auto
              instance_profile_arn: '${var.instance_profilearn}'
              spot_bid_price_percent: 100
              ebs_volume_count: 0
            node_type_id: '${var.node_type}'
            driver_node_type_id: '${var.node_type}'
            spark_env_vars:
              PYSPARK_PYTHON: /databricks/python3/bin/python3
            enable_elastic_disk: true
            data_security_mode: NONE
            runtime_engine: PHOTON
            autoscale:
              min_workers: 1
              max_workers: '${var.max_workers_instance}'
      queue:
        enabled: false
 
#asset_bundles_job3.yaml
resources:
  jobs:
    asset_bundles_job3:
      name: asset_bundles_job3
      description: >+
        Extracts data from APIs.
      health:
        rules:
          - metric: RUN_DURATION_SECONDS
            op: GREATER_THAN
            value: 3600
      schedule:
        quartz_cron_expression: 0 0/15 * * * ?
        timezone_id: UTC
        pause_status: ${var.job_status}
      max_concurrent_runs: 1
      tasks:
        - task_key: task1
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script1.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
        - task_key: task2
          job_cluster_key: '${var.job_cluster_name}'
          notebook_task:
            notebook_path: ../src/script2.py
            base_parameters:
              configFile: '${var.config_file}'
              config_region: '${var.config_region}'
          max_retries: '${var.max_retries}'
          min_retry_interval_millis: '${var.intv_seconds}'
  
      job_clusters:
        - job_cluster_key: '${var.job_cluster_name}'
          new_cluster:
            spark_version: 15.4.x-scala2.12
            spark_conf:
              spark.databricks.repl.allowedLanguages: 'sql,python,r,scala'
              spark.databricks.delta.retentionDurationCheck.enabled: false
              spark.databricks.hive.metastore.glueCatalog.enabled: true
              spark.hadoop.fs.s3a.acl.default: BucketOwnerFullControl
              spark.hadoop.hive.metastore.glue.catalogid: '${var.glue_catalog}'
            aws_attributes:
              first_on_demand: 1
              availability: SPOT_WITH_FALLBACK
              zone_id: auto
              instance_profile_arn: '${var.instance_profilearn}'
              spot_bid_price_percent: 100
              ebs_volume_count: 0
            node_type_id: '${var.node_type}'
            driver_node_type_id: '${var.node_type}'
            spark_env_vars:
              PYSPARK_PYTHON: /databricks/python3/bin/python3
            enable_elastic_disk: true
            data_security_mode: NONE
            runtime_engine: PHOTON
            autoscale:
              min_workers: 1
              max_workers: '${var.max_workers_instance}'
      queue:
        enabled: false
 
#databricks.yaml
bundle:
  name: mulitple_jobs
include:
  - resources/asset_bundles_job1.yml
  - resources/asset_bundles_job2.yml
  - resources/asset_bundles_job3.yml
variables:
  config_file:
    description: Config file for the respective environment
    default: ../../../resources/config/dit.ini
  config_file_a:
    description: Config file for the respective environment
    default: ../../../resources/config/a_dit.ini
  config_file_b:
    description: Config file for the respective environment
    default: ../../../resources/config/b_dit.ini
  config_region:
    description: config region
    default: regiona
  glue_catalog:
    description: Glue Catalogid Details
  instance_profilearn:
    description: Instance Profile Arn Details
  instance_profilearn_a:
    description: Instance Profile Arn Details
  max_workers_instance:
    description: Max workers
    default: 2
  node_type:
    description: Ec2 Instance Type
    default: r5d.xlarge
  job_cluster_name:
    description: Name of the Job Cluster
    default: job_cluster
  max_retries:
    description: Max retries for the task
    default: 1
  intv_seconds:
    description: Retry interval millis
    default: 15000
  job_status:
    description: Determines whether the jobs should be paused or unpaused per environment
    default: PAUSED
 
targets:
  dev:
    variables:
      config_file: ../../../resources/config/a_fit.ini
      config_file_b: ../../../resources/config/b_dit.ini
      glue_catalog: '123456'
      instance_profilearn: >-
        arn:aws:iam::123456:instance-profile/Databricks-role
      instance_profilearn_wisely: >-
        arn:aws:iam::123456:instance-profile/Databricks-role
      max_workers_instance: 2
      node_type: i3.2xlarge
      job_cluster_name: '${bundle.name}_Job'
      job_status: PAUSED
    mode: development
    default: true
    workspace:
      root_path: >-
        /Users/test1@gmail.com/.bundle/${bundle.name}/${bundle.target}
    run_as:
      user_name: test1@gmail.com
 
  dev_ca:
    variables:
      config_file: ../../../resources/config/a_fit.ini
      config_file_b: ../../../resources/config/b_dit.ini
      glue_catalog: '123456'
      instance_profilearn: >-
        arn:aws:iam::123456:instance-profile/Databricks-role
      instance_profilearn_wisely: >-
        arn:aws:iam::123456:instance-profile/Databricks-role
      max_workers_instance: 2
      node_type: i3.2xlarge
      job_cluster_name: '${bundle.name}_Job'
      job_status: PAUSED
    mode: development
    default: true
    workspace:
      root_path: >-
        /Users/test1@gmail.com/.bundle/${bundle.name}/${bundle.target}
    run_as:
      user_name: test1@gmail.com
 
  resources:
    jobs:
      asset_bundles_job1:
        deployments: [dev]
        permissions:
          - user_name: test1@gmail.com
            level: CAN_MANAGE
        email_notifications:
          on_failure: [test1@gmail.com]
          on_duration_warning_threshold_exceeded: [test1@gmail.com]
          no_alert_for_skipped_runs: true
 
      asset_bundles_job2:
        deployments: [dev]
        permissions:
          - user_name: test1@gmail.com
            level: CAN_MANAGE
        email_notifications:
          on_failure: [test1@gmail.com]
          on_duration_warning_threshold_exceeded: [test1@gmail.com]
          no_alert_for_skipped_runs: true
 
      asset_bundles_job2:
        deployments: [dev_ca]
        permissions:
          - user_name: test1@gmail.com
            level: CAN_MANAGE
        email_notifications:
          on_failure: [test1@gmail.com]
          on_duration_warning_threshold_exceeded: [test1@gmail.com]
          no_alert_for_skipped_runs: true