Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

DABs, policies and cluster pools

tbailey
New Contributor II

My scenario:

A policy called 'Job Pool', which has the following overrides:

"instance_pool_id": {
  "type": "unlimited",
  "hidden": true
},
"driver_instance_pool_id": {
  "type": "unlimited",
  "hidden": true
}

I have an asset bundle that sets a new cluster as follows:

new_cluster:
  apply_policy_default_values: true
  policy_id: ${var.job_compute_policy_id}
  driver_instance_pool_id: ${var.on_demand_driver_pool_id}
  instance_pool_id: ${var.on_demand_instance_pool_id}
  autoscale:
    min_workers: 1
    max_workers: 8

 

When trying to deploy the bundle, I get the following error:

Error: cannot update job: Cluster validation error: Validation failed for azure_attributes.spot_bid_max_price from worker pool, the value must be present; Validation failed for azure_attributes.spot_bid_max_price from driver pool, the value must be present

This is despite both being on-demand pools. 

If I change the worker pool to a spot pool and add the following override:

"azure_attributes.spot_bid_max_price": {
  "type": "unlimited",
  "defaultValue": 100
}

I get a similar error, but without reference to the worker pool:

Error: cannot update job: Cluster validation error: Validation failed for azure_attributes.spot_bid_max_price from driver pool, the value must be present

I have also tried setting both driver and worker to the same on-demand pool, with variations of the override including type 'unlimited', a value of 100, and the following:

"azure_attributes.spot_bid_max_price": {
  "type": "fixed",
  "value": -1
}

The deployment still fails asking for spot_bid pricing.

Error: cannot update job: Cluster validation error: Validation failed for azure_attributes.spot_bid_max_price from pool, the value must be present

The only thing that seems to work is deploying both to a spot-instance pool, which I don't want to do because my driver is

Questions are:

- How can I assign the driver to an on-demand pool and the workers to a spot pool in a DAB resource definition without generating this error? The docs don't show pool examples (please link them if I'm wrong).

- How can I assign both to the same on demand pool if the above is not possible?

- Where are your docs on working with pools & policies for DABs?

 

3 REPLIES

tbailey
New Contributor II

Update

Further poking suggests that questions (1), on-demand driver with spot workers, and (2), both on-demand, each require a custom policy that removes

azure_attributes.spot_bid_max_price{}

from the definition.

I then tried to see if I could set the spot % to something random like 23:

new_cluster:
  apply_policy_default_values: true
  policy_id: ${var.job_compute_policy_id}
  driver_instance_pool_id: ${var.driver_instance_pool_id}
  instance_pool_id: ${var.instance_pool_id}
  azure_attributes:
    spot_bid_max_price: 23
  autoscale:
    min_workers: 1
    max_workers: 8
            

This didn't seem to work, including putting spot_bid_max_price inside a curly brace block - at least the job definition in the UI showed azure_attributes as blank.  I assume it defaults to 100.

So outstanding questions are:

1) Setting the spot_bid_max_price for the spot pool

2) Any docs on this scenario.

 

 

-werners-
Esteemed Contributor III

I have a similar issue. Also with the bid price.
It seems that the Databricks API/DAB does not take the correct values in the case of mixed clusters (driver/workers).
The funny part is that this only occurs when redeploying a DAB, not on the initial create.
So a destroy and deploy might work, which is not cool because it will change the job ID.

mark_ott
Databricks Employee

You are seeing validation errors when assigning the driver to an on-demand pool and the workers to a spot pool in your Databricks Asset Bundle (DAB) configuration because the policy forces the 'spot_bid_max_price' attribute, even when the pools are on-demand and the attribute should only be relevant for spot clusters. This is a common issue, especially when policies include 'azure_attributes.spot_bid_max_price' overrides or fail to distinguish between driver and worker configurations.

Assigning Mixed Driver/Worker Pools

  • To set the driver on an on-demand pool and workers on a spot pool in DAB:

    • Create two separate instance pools: one on-demand, one spot.

    • Your DAB cluster definition should specify driver_instance_pool_id for the driver's on-demand pool and instance_pool_id for the workers' spot pool, as you are.

    • In your policy:

      • Remove all azure_attributes.spot_bid_max_price overrides if you want to allow on-demand (not spot) for the driver.

      • Spot price (spot_bid_max_price) only needs to be present when a spot pool is used. If you apply the policy to both nodes and keep that attribute, Databricks expects the value even for on-demand pools, hence the error.

      • Best practice is to clone and customize your cluster policy, removing inherited spot price requirements for on-demand pools.

  • Documentation showing example pool usage in policies is limited, but the official reference for asset bundle resource types explains this setup generally (see "Supported resource types for bundles" in Databricks docs).

Key Policy Considerations

  • Do not set spot_bid_max_price for on-demand pools.

  • If the pool configuration or cluster policy inherits spot_bid_max_price, you need to remove it for policies used with on-demand pools.

  • If the workers are spot, set spot_bid_max_price only for their pool, not the driver's pool.

  • Many users solved validation failures by customizing the policy and removing or correctly scoping spot_bid_max_price.

Practical Steps

  • Remove azure_attributes.spot_bid_max_price from the cluster policy if using on-demand pools. Only include it when the pool is a spot pool.

  • If you need a mixed setup, the driver must not inherit spot pricing requirements from worker pool policy overrides.

  • Always validate policies after modification.

  • If the inherited policy cannot be edited directly, clone and customize it as advised above.
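
The "clone and customize" step can be sketched offline in Python. This is only an illustration under assumptions: it treats the policy definition as a flat JSON object keyed by attribute path, as in the snippets in the question, and the names strip_spot_bid and job_pool_policy are hypothetical. It does not call any Databricks API; uploading the resulting definition as a new policy is left out.

```python
import copy
import json

# Attribute path named in the validation error quoted in this thread.
SPOT_BID_KEY = "azure_attributes.spot_bid_max_price"


def strip_spot_bid(definition: dict) -> dict:
    """Return a copy of a policy definition without the spot bid entry.

    Hypothetical helper: the original definition is left untouched so the
    spot-pool policy and the on-demand policy can exist side by side.
    """
    cleaned = copy.deepcopy(definition)
    cleaned.pop(SPOT_BID_KEY, None)  # no error if the key is already absent
    return cleaned


# Assumed starting point: the 'Job Pool' overrides from the question plus
# the spot bid override that was added later in the thread.
job_pool_policy = {
    "instance_pool_id": {"type": "unlimited", "hidden": True},
    "driver_instance_pool_id": {"type": "unlimited", "hidden": True},
    SPOT_BID_KEY: {"type": "unlimited", "defaultValue": 100},
}

on_demand_policy = strip_spot_bid(job_pool_policy)

# Print the JSON that would be pasted into the cloned policy's definition.
print(json.dumps(on_demand_policy, indent=2))
```

If the spot bid requirement comes from somewhere other than the policy's own definition, removing the key this way may not be enough, so the bundle should still be redeployed to confirm the validation error is gone.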

If you need explicit YAML or JSON examples or further clarification, let me know and I can provide customized templates. As for official documentation with pool examples: Databricks doesn't currently offer in-depth example YAML for every pool scenario, but you can reference the cluster policy and asset bundle resource guides mentioned above.