DABs, policies and cluster pools

tbailey
New Contributor II

My scenario:

A policy called 'Job Pool', which has the following overrides:

"instance_pool_id": {
  "type": "unlimited",
  "hidden": true
},
"driver_instance_pool_id": {
  "type": "unlimited",
  "hidden": true
}

I have an asset bundle that sets a new cluster as follows:

new_cluster:
  apply_policy_default_values: true
  policy_id: ${var.job_compute_policy_id}
  driver_instance_pool_id: ${var.on_demand_driver_pool_id}
  instance_pool_id: ${var.on_demand_instance_pool_id}
  autoscale:
    min_workers: 1
    max_workers: 8

 

When trying to deploy the bundle, I get the following error:

Error: cannot update job: Cluster validation error: Validation failed for azure_attributes.spot_bid_max_price from worker pool, the value must be present; Validation failed for azure_attributes.spot_bid_max_price from driver pool, the value must be present

This is despite both being on-demand pools. 

If I change the worker pool to a spot pool and add the following override:

"azure_attributes.spot_bid_max_price": {
  "type": "unlimited",
  "defaultValue": 100
}

I get a similar error, but without reference to the worker pool:

Error: cannot update job: Cluster validation error: Validation failed for azure_attributes.spot_bid_max_price from driver pool, the value must be present

I have also tried assigning both driver and worker to the same on-demand pool, with variations of the override including type 'unlimited', a value of 100, and the following:

 "azure_attributes.spot_bid_max_price": {
    "type": "fixed",
    "value": -1
  }

The deployment still fails, asking for spot_bid_max_price:

Error: cannot update job: Cluster validation error: Validation failed for azure_attributes.spot_bid_max_price from pool, the value must be present

The only thing that seems to work is deploying both to a spot-instance pool, which I don't want to do because my driver is

Questions are:

- How can I assign the driver to an on-demand pool and the workers to a spot pool in a DAB resource definition without generating this error? The docs don't show pool examples (please link me to them if I'm wrong).

- How can I assign both to the same on-demand pool if the above is not possible?

- Where are your docs on working with pools & policies for DABs?

 

1 REPLY

tbailey
New Contributor II

Update

Further poking suggests that both scenarios, (1) on-demand driver with spot workers and (2) both on-demand, require a custom policy that removes

azure_attributes.spot_bid_max_price{}

from the definition.
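
A minimal sketch of the workaround described above, assuming the policy definition has been exported as a JSON document that maps attribute paths to rules. The helper name `drop_policy_attribute` and the sample definition are illustrative only, not taken from a real workspace; the result would still need to be saved as a custom policy by whatever means you normally manage policies.

```python
import copy
import json


def drop_policy_attribute(definition: dict, attribute: str) -> dict:
    """Return a copy of a policy definition without the rule for `attribute`."""
    cleaned = copy.deepcopy(definition)
    # pop() with a default is a no-op when the attribute is not present
    cleaned.pop(attribute, None)
    return cleaned


# Hypothetical definition built from the overrides quoted in this thread.
policy_definition = {
    "instance_pool_id": {"type": "unlimited", "hidden": True},
    "driver_instance_pool_id": {"type": "unlimited", "hidden": True},
    "azure_attributes.spot_bid_max_price": {"type": "unlimited", "defaultValue": 100},
}

custom_definition = drop_policy_attribute(
    policy_definition, "azure_attributes.spot_bid_max_price"
)

# JSON text that could be pasted into a custom policy definition
print(json.dumps(custom_definition, indent=2))
```

The original dictionary is left untouched, so the same helper can be reused to compare the definition before and after the change.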

I then tried to see whether I could set the spot bid max price to something arbitrary, like 23:

new_cluster:
  apply_policy_default_values: true
  policy_id: ${var.job_compute_policy_id}
  driver_instance_pool_id: ${var.driver_instance_pool_id}
  instance_pool_id: ${var.instance_pool_id}
  azure_attributes:
    spot_bid_max_price: 23
  autoscale:
    min_workers: 1
    max_workers: 8
            

This didn't seem to work, nor did putting spot_bid_max_price inside a curly-brace block; at least, the job definition in the UI showed azure_attributes as blank. I assume it defaults to 100.

So outstanding questions are:

1) Setting the spot_bid_max_price for the spot pool

2) Any docs on this scenario.
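
One avenue not confirmed in this thread: the Instance Pools REST API accepts its own azure_attributes block with availability and spot_bid_max_price fields, so the bid price may need to be set on the spot pool itself rather than in new_cluster. The sketch below only builds a request body under that assumption; the function name, pool name and node type are hypothetical, and -1 simply mirrors the fixed value tried earlier in the thread.

```python
import json


def build_spot_pool_payload(name: str, node_type_id: str, spot_bid_max_price: float = -1) -> dict:
    """Build a request body for creating an Azure spot instance pool (assumed shape)."""
    return {
        "instance_pool_name": name,
        "node_type_id": node_type_id,
        "azure_attributes": {
            # Assumption: the bid price is a pool-level setting
            "availability": "SPOT_AZURE",
            "spot_bid_max_price": spot_bid_max_price,
        },
    }


# Hypothetical pool name and node type, for illustration only.
payload = build_spot_pool_payload("spot-worker-pool", "Standard_DS3_v2")
print(json.dumps(payload, indent=2))
```

If this holds, the DAB new_cluster block would reference the pool by instance_pool_id only and leave azure_attributes out.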
