Creating a spot-only single-node job compute cluster policy
01-24-2023 09:45 AM
Hi there,
I need some help creating a new cluster policy that uses a single spot-instance server to complete a job. I want to set this up as job compute to reduce costs, running on just 1 spot instance.
The ETL jobs I need to run are very short and complete within a few minutes, and I don't think it's wise to spend 2 DBUs on something when 1 DBU would suffice.
Thank you in advance for your help!
K
Labels: Cluster, Dbu, Spot instances
01-24-2023 10:20 AM
Below is the required policy. Spot instances have to be defined inside the pool, which is why I included a reference to a pool below.
{
  "cluster_type": {
    "type": "fixed",
    "value": "job"
  },
  "spark_conf.spark.databricks.cluster.profile": {
    "type": "fixed",
    "value": "singleNode",
    "hidden": true
  },
  "instance_pool_id": {
    "type": "fixed",
    "value": "singleNodePoolId1",
    "hidden": true
  },
  "num_workers": {
    "type": "range",
    "maxValue": 0
  }
}
01-24-2023 10:59 AM
Hi there,
Thank you for the quick reply. I'm looking to create a policy not for a pool but for any job in the workflow.
Here is the current policy I am playing with. Please let me know if you see where this is off.
{
  "spark_conf.spark.databricks.cluster.profile": {
    "type": "fixed",
    "value": "singleNode",
    "hidden": true
  },
  "spark_version": {
    "type": "unlimited",
    "defaultValue": "auto:latest-lts"
  },
  "enable_elastic_disk": {
    "type": "fixed",
    "value": true,
    "hidden": true
  },
  "node_type_id": {
    "type": "unlimited",
    "defaultValue": "i3.xlarge",
    "isOptional": true
  },
  "num_workers": {
    "type": "fixed",
    "value": 0,
    "hidden": true
  },
  "aws_attributes.availability": {
    "type": "fixed",
    "value": "SPOT",
    "hidden": true
  },
  "aws_attributes.zone_id": {
    "type": "unlimited",
    "defaultValue": "auto",
    "hidden": true
  },
  "aws_attributes.spot_bid_price_percent": {
    "type": "fixed",
    "value": 100,
    "hidden": true
  },
  "instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "driver_instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "cluster_type": {
    "type": "fixed",
    "value": "job"
  }
}
P.S. When I copy your code into the policy maker it says singleNodePoolId1 does not exist.
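To make it easier to see what the "fixed" and "forbidden" entries in a policy like the one above would accept or reject, here is a rough sketch in Python. It is a toy checker written for illustration only, not how Databricks evaluates policies: the `flatten` and `violations` helpers and the sample cluster specs are hypothetical, and only a subset of the policy is copied in.

```python
# Subset of the policy above: attribute path -> rule.
POLICY = {
    "spark_conf.spark.databricks.cluster.profile": {"type": "fixed", "value": "singleNode"},
    "num_workers": {"type": "fixed", "value": 0},
    "aws_attributes.availability": {"type": "fixed", "value": "SPOT"},
    "aws_attributes.spot_bid_price_percent": {"type": "fixed", "value": 100},
    "instance_pool_id": {"type": "forbidden"},
    "node_type_id": {"type": "unlimited", "defaultValue": "i3.xlarge"},
}


def flatten(spec, prefix=""):
    """Flatten a nested cluster spec into dotted attribute paths."""
    flat = {}
    for key, value in spec.items():
        path = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


def violations(spec, policy):
    """Return human-readable problems (simplified: fixed and forbidden only)."""
    flat = flatten(spec)
    problems = []
    for path, rule in policy.items():
        if rule["type"] == "forbidden" and path in flat:
            problems.append(f"{path} must not be set")
        elif rule["type"] == "fixed" and path in flat and flat[path] != rule["value"]:
            problems.append(f"{path} must be {rule['value']!r}, got {flat[path]!r}")
    return problems


# Hypothetical job cluster spec that matches the policy.
good_cluster = {
    "node_type_id": "i3.xlarge",
    "num_workers": 0,
    "spark_conf": {"spark.databricks.cluster.profile": "singleNode"},
    "aws_attributes": {"availability": "SPOT", "spot_bid_price_percent": 100},
}

# Hypothetical spec that breaks two rules: extra workers and a pool reference.
bad_cluster = dict(good_cluster, num_workers=2, instance_pool_id="some-pool-id")

print(violations(good_cluster, POLICY))
print(violations(bad_cluster, POLICY))
```

The real policy engine also handles defaults, ranges, allow lists and hidden fields, so treat this only as a way to reason about which attributes a job cluster may set.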
01-25-2023 12:53 AM
This is the policy for the job, but if you want to use spot instances, you first need to create a pool that uses spot instances. singleNodePoolId1 is just a placeholder. Create a spot pool with 1 machine, name it however you want, and put that pool's identifier into the JSON as the instance_pool_id value.
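The two steps described here (create a spot pool, then point the policy at it) can be sketched in Python as plain payload builders. This is only a sketch under assumptions: the pool field names follow the Instance Pools API as I understand it and should be checked against the current documentation, the function names, the pool name "single-node-spot-pool" and the ID "your-pool-id" are made up, and nothing here actually calls Databricks.

```python
import json


def spot_pool_request(name, node_type_id="i3.xlarge"):
    """Build a request body for a spot pool capped at one machine.

    Field names are assumed from the Instance Pools API; verify before use.
    """
    return {
        "instance_pool_name": name,
        "node_type_id": node_type_id,
        "min_idle_instances": 0,
        "max_capacity": 1,
        "idle_instance_autotermination_minutes": 10,
        "aws_attributes": {
            "availability": "SPOT",
            "spot_bid_price_percent": 100,
        },
    }


def single_node_pool_policy(pool_id):
    """Build the single-node job policy from the earlier reply for a real pool ID."""
    return {
        "cluster_type": {"type": "fixed", "value": "job"},
        "spark_conf.spark.databricks.cluster.profile": {
            "type": "fixed",
            "value": "singleNode",
            "hidden": True,
        },
        "instance_pool_id": {"type": "fixed", "value": pool_id, "hidden": True},
        "num_workers": {"type": "range", "maxValue": 0},
    }


# Hypothetical pool name; the pool ID is whatever Databricks returns after creation.
pool_request = spot_pool_request("single-node-spot-pool")
policy = single_node_pool_policy("your-pool-id")

print(json.dumps(pool_request, indent=2))
print(json.dumps(policy, indent=2))
```

The printed policy JSON is what would be pasted into the policy editor once "your-pool-id" is replaced with the ID of the pool you created.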
02-24-2023 03:41 PM
Hi @Avkash Kana,
Just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

